Stereochemistry

Biological systems are highly stereoselective, so a chemical structure search engine must provide stereospecific query tools. JChem handles tetrahedral stereochemistry, double bond E/Z stereochemistry, isomers of cumulenes, atropisomers, and syn-anti/endo-exo isomers of bridged bicyclic compounds. Relative tetrahedral stereo configuration and different stereo models are also supported, and various search options modify the stereo-related search behavior.

The following stereo search options are available:

Stereospecific

    • When the query does not contain stereo information, the hits include results both with and without stereo information. Otherwise, the stereo information is taken into account during the search. This is the default for search types other than duplicate. The stereo search options below can modify this default.

Exact stereo

    • All stereo information is tested for equality, meaning that a non-stereo query matches only non-stereo targets and enhanced stereo groups cannot match absolute stereo atoms. This is the default for duplicate search.

Diastereomer searching

    • Retrieves stereoisomers where tetrahedral stereo information is present on the same stereo centers, but their configuration (parity) is arbitrary.

Enantiomer searching

    • Retrieves the given query stereo configuration and its enantiomer as well. This option works in the same way as if all atoms without enhanced stereo information belonged to a further AND group (see priority).

Ignored stereo

    • All stereo information is ignored.
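The five options above can be compared with a small toy model. This is only a sketch and not the JChem API: the names `StereoOption` and `stereo_match` are hypothetical, each molecule is reduced to a tuple of tetrahedral parities ('R', 'S' or None for unspecified) on centers that are assumed to be mapped already, and enhanced stereo groups are left out. For diastereomer search it is assumed that an unspecified query center places no constraint on the target.

```python
from enum import Enum


class StereoOption(Enum):
    """Hypothetical names for the five options described in the text."""
    STEREOSPECIFIC = "stereospecific"
    EXACT = "exact"
    DIASTEREOMER = "diastereomer"
    ENANTIOMER = "enantiomer"
    IGNORE = "ignore"


MIRROR = {"R": "S", "S": "R", None: None}


def _specific(query, target):
    # An unspecified query center (None) matches anything;
    # a specified query center must equal the target center.
    return all(q is None or q == t for q, t in zip(query, target))


def stereo_match(query, target, option):
    """Compare parity tuples of already mapped tetrahedral centers."""
    if option is StereoOption.IGNORE:
        return True
    if option is StereoOption.EXACT:
        # Every center must carry identical information, including "none".
        return tuple(query) == tuple(target)
    if option is StereoOption.DIASTEREOMER:
        # Assumption: stereo must be present where the query has it,
        # but the actual parity is arbitrary.
        return all(q is None or t is not None for q, t in zip(query, target))
    if option is StereoOption.ENANTIOMER:
        mirrored = tuple(MIRROR[q] for q in query)
        return _specific(query, target) or _specific(mirrored, target)
    return _specific(query, target)
```

In this model a query without stereo information matches a stereo target under the stereospecific option but not under exact stereo, which mirrors the difference described above.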

Tetrahedral stereochemistry

Tetrahedral stereochemistry information is derived from different molecular features, depending on dimensionality:

  • 3D: When the molecule contains 3D coordinates, those alone define tetrahedral stereochemistry.
  • 2D: For 2D structure diagrams, coordinates and wedge, hatch or wiggly bond types define stereochemistry. These bond types define the relative position of ligands to the stereo center atom, which should be at the narrow end by default:

    • up (wedge bond): the ligand at the wide end is above the atom at the narrow end
    • down (hatch bond): the ligand at the wide end is below the atom at the narrow end
    • up or down (wiggly bond): either tetrahedral chirality information is present but the actual stereo configuration is irrelevant, or the cis/trans configuration of a double bond is irrelevant (see below)

Non-stereo (plain) bonds are assumed to be in the plane of the paper. When the stereo center also has an implicit hydrogen, that hydrogen is assumed to point downwards.
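One common way to turn such a 2D drawing into a parity is to lift each ligand out of the plane according to its bond type and take the sign of the signed volume spanned by the four ligands. The sketch below illustrates that general technique only; it is not JChem's implementation. The names `Z_OFFSET`, `lift` and `chirality_sign` are hypothetical, all four ligands are assumed to be explicit, and the result is a bare sign rather than an R/S label because CIP priorities are not modelled.

```python
# Assumed out-of-plane offsets for the bond types described in the text.
Z_OFFSET = {"plain": 0.0, "up": 1.0, "down": -1.0}


def lift(ligands):
    """Turn (x, y, bond_type) ligands into pseudo-3D (x, y, z) points."""
    return [(x, y, Z_OFFSET[bond]) for x, y, bond in ligands]


def chirality_sign(ligands):
    """Sign of the signed volume of four ligands given in a fixed order.

    Returns +1 or -1 for the two mirror-image arrangements and 0 when the
    four points are coplanar (no stereo information in the drawing).
    """
    a, b, c, d = lift(ligands)
    u = [a[i] - d[i] for i in range(3)]
    v = [b[i] - d[i] for i in range(3)]
    w = [c[i] - d[i] for i in range(3)]
    det = (u[0] * (v[1] * w[2] - v[2] * w[1])
           - u[1] * (v[0] * w[2] - v[2] * w[0])
           + u[2] * (v[0] * w[1] - v[1] * w[0]))
    return (det > 0) - (det < 0)


# Same drawing, once with a wedge and once with a hatch on the fourth ligand.
wedge = [(1.0, 0.0, "plain"), (-0.5, 0.87, "plain"),
         (-0.5, -0.87, "plain"), (0.3, 0.3, "up")]
hatch = wedge[:3] + [(0.3, 0.3, "down")]
```

Replacing the wedge with a hatch, or swapping any two ligands in the list, flips the sign, which is the behavior expected of a tetrahedral parity.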

Table 1. depicts a few examples of tetrahedral stereo matching, assuming absolute stereochemistry (see next section for further details).

Table 1. Tetrahedral stereo matching

 

[Table 1: target and query structures; structure images not available.]

 

Exact stereo search

The exact stereo option means that all stereo information must be the same in the query and the target ("all stereo info is exactly the same"). It mainly has an effect when the query has no stereo information: such a query then matches only non-stereo targets. Similarly, a query with a wiggly tetrahedral center matches only a wiggly tetrahedral center, and not the specific R and S configurations.

Table 2. Exact stereo matching

 

[Table 2: target and query structures; structure images not available.]

Effect of tautomer search options on tetrahedral stereo matching

Tautomer search options affect tetrahedral stereo matching in the tautomer regions.

With 'tautomer search on', the tautomer forms of the query and the target are taken into account. A tetrahedral stereo center has to match according to the applied stereo search type if it is present in both tautomer forms (query and target). The match is not enforced if the tetrahedral arrangement in the tautomer region is lost (changed to an sp2 hybridized, planar arrangement).

With the 'tautomer search on with ignore stereo information in tautomer region' option, only the tetrahedral centers outside the tautomer regions have to match according to the applied stereo search type. The stereo information of tetrahedral stereo centers inside the tautomer regions is ignored. Compared with 'tautomer search on', this option also finds molecules whose tetrahedral centers in the tautomer regions do not match those of the query under the applied stereo search type.
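The two tautomer options differ only in which centers are still subject to stereo matching. A minimal sketch of that decision, assuming a hypothetical helper `center_must_match` and made-up mode strings (neither is part of JChem):

```python
def center_must_match(tautomer_mode, in_tautomer_region, tetrahedral_in_both_forms):
    """Decide whether a tetrahedral center is subject to stereo matching.

    tautomer_mode: "on" or "on_ignore_stereo_in_region" (hypothetical names
    for the two options described in the text).
    """
    if tautomer_mode not in ("on", "on_ignore_stereo_in_region"):
        raise ValueError(f"unknown tautomer mode: {tautomer_mode}")
    if not tetrahedral_in_both_forms:
        # The center became planar (sp2) in one form: nothing to enforce.
        return False
    if tautomer_mode == "on_ignore_stereo_in_region" and in_tautomer_region:
        # Stereo information inside tautomer regions is ignored.
        return False
    return True
```

A center for which this returns True is then compared according to whichever stereo search type is active.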

See the examples on the tautomer search options page.

E/Z stereochemistry of double bonds


The ligand pairs of a stereo double bond define its stereo configuration (referred to as cis/trans or E/Z configuration). In 2D and 3D molecules this configuration is derived from the atomic coordinates. For molecules without coordinates (0D, such as SMILES), stereo double bonds are distinguished in other ways. For example, SMILES uses the directional bonds / and \ on the ligand bonds, and the CML and MRV formats use a bond flag in the 0D case (the <bondStereo> tag).
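As an illustration of the SMILES convention, F/C=C/F denotes the trans isomer and F/C=C\F the cis isomer of 1,2-difluoroethene. The sketch below reads the configuration from the two directional bond symbols. It is deliberately limited to the simplest linear pattern A/C=C/B and is not a SMILES parser; the function name is hypothetical.

```python
import re

# Only handles "<atoms><dir>C=C<dir><atoms>", e.g. F/C=C/F or Cl/C=C\Br.
_SIMPLE_PATTERN = re.compile(r"^([A-Za-z]+)([/\\])C=C([/\\])([A-Za-z]+)$")


def simple_double_bond_config(smiles):
    """Return 'cis', 'trans', or None if the string is outside the sketch."""
    match = _SIMPLE_PATTERN.match(smiles)
    if match is None:
        return None
    first, second = match.group(2), match.group(3)
    # In this linear pattern, equal symbols put the ligands on opposite sides.
    return "trans" if first == second else "cis"
```

Strings without directional bonds, such as C=C, fall outside the pattern and yield None, which corresponds to a double bond with no specified configuration.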

The setDoubleBondStereoMatchingMode() search option controls the behavior regarding double bond cis/trans isomerism. It can set three different search states:

  • DBS_NONE: No double bond cis/trans is considered.
  • DBS_MARKED: (Default) Double bond cis/trans stereo is checked for double bonds designated by the stereo search flag only. However, for queries coming from the Daylight formats family, all double bonds are considered which have specified stereo configuration. See section below about the matching differences between different formats.
  • DBS_ALL: All double bonds are checked for cis/trans stereo matching.
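The three states can be summarised as a rule for deciding whether a given query double bond takes part in cis/trans matching. The following is a toy model in Python, not the JChem Java API: the enum only reuses the constant names from the list above, and `is_checked` and `double_bond_matches` are hypothetical helpers. The sketch covers only bonds where both query and target carry a defined cis or trans configuration.

```python
from enum import Enum


class DoubleBondStereoMode(Enum):
    """Toy mirror of the three states named in the text."""
    DBS_NONE = 0
    DBS_MARKED = 1
    DBS_ALL = 2


def is_checked(mode, marked=False, daylight_query=False, query_has_config=False):
    """Is this query double bond subject to cis/trans matching?"""
    if mode is DoubleBondStereoMode.DBS_NONE:
        return False
    if mode is DoubleBondStereoMode.DBS_ALL:
        return True
    # DBS_MARKED: flagged bonds, plus specified bonds in Daylight-format queries.
    return marked or (daylight_query and query_has_config)


def double_bond_matches(mode, query_config, target_config,
                        marked=False, daylight_query=False):
    """Compare two defined configurations ('cis' or 'trans')."""
    if not is_checked(mode, marked, daylight_query, query_has_config=True):
        return True
    return query_config == target_config
```

With the default DBS_MARKED state, an unflagged cis query bond from a drawn structure still matches a trans target bond, while the same bond carrying the stereo search flag does not.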

In case of DBS_MARKED, a small box should be placed on the query double bond to indicate the stereo search flag. This means that those double bonds will be considered as stereo during the search. In this case, the corresponding double bond in the target molecule structure must have the same stereo configuration as drawn in the query (Table 3.).

cis (the two atoms are on the same side of the double bond)

trans (the two atoms are on the opposite sides of the double bond)

cis or trans (stereo bond with either cis or trans configuration)

cis or trans (stereo bond with either cis or trans configuration)

not trans

not cis

Table 3. Stereo double bonds

Examples (DBS_MARKED):

[Target and query structures; structure images not available.]

Special types of E/Z stereoisomerism

Double bond stereo matching near aromatic rings

With general aromatization, the aromatic rings of some structures can contain double bonds (see the aromatization documentation). In such cases the cis-trans stereo information of these double bonds is considered during searching.
Example:

[Target and query structures; structure images not available.]

Cumulated system with odd numbers of double bonds and/or rings

This search option handles the cis-trans stereo isomerism of cumulenes with an odd number of double bonds (at least 3). It also handles the cis-trans stereo isomerism of rings where the count of spiro arranged rings is an odd number (1, 3, ...). Combinations of double bonds and rings are taken into account as well if the sum of the double bonds and rings is an odd number (at least 3). Consider the following when exploring stereoisomers of cumulenes:

  • Cumulene or ring cis-trans stereo search option is switched off by default.
  • The stereochemical information of cumulenes cannot be exported in 0D.
  • Global stereo model is used exclusively during Cumulene or ring cis-trans stereo search tasks, even if a different model has been selected for the search.
  • Works for substructure, duplicate and full search methods.
  • Exact Cumulene or ring cis-trans stereo search is available only in case of full, full fragment, or duplicate search.
  • This option is not available for markush structures.
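The counting rule stated before this list (an odd number of cumulated double bonds and/or spiro arranged rings) can be written down directly. A small sketch with a hypothetical function name; it encodes only the counts given in the text and performs no structure perception:

```python
def cumulene_or_ring_cis_trans_applies(double_bonds, spiro_rings=0):
    """Does the odd-count rule for cis-trans isomerism apply?

    double_bonds: number of cumulated double bonds in the system
    spiro_rings:  number of spiro arranged rings in the system
    """
    if double_bonds < 0 or spiro_rings < 0:
        raise ValueError("counts must be non-negative")
    total = double_bonds + spiro_rings
    if total % 2 == 0:
        return False  # even systems fall under axial isomerism, not cis-trans
    if spiro_rings == 0:
        # Pure cumulene: at least 3 double bonds. A single double bond is
        # ordinary E/Z stereo and is handled by the double bond option.
        return double_bonds >= 3
    # Rings only (1, 3, ...) or a mixed system; a mixed odd total is always >= 3.
    return True
```

For example, a cumulene with three double bonds qualifies, and so does one double bond combined with two spiro rings, whereas an allene (two double bonds) does not.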

Examples:

[Three target and query structure pairs; structure images not available.]

 

Effect of tautomer search options on E/Z stereo matching

Tautomer search options affect E/Z stereo matching in the tautomer regions.

With 'tautomer search on', the tautomer forms of the query and the target are taken into account. An E/Z stereo bond has to match according to the applied double bond stereo search type if both tautomer forms (query and target) contain the corresponding double bond. In addition, those tautomers of the query are also found in which the respective E/Z stereo double bond has changed to a non-double bond.

With the 'tautomer search on with ignore stereo information in tautomer region' option, the E/Z stereo bonds outside the tautomer regions have to match according to the applied double bond stereo search type, and the stereo information of E/Z stereo bonds inside the tautomer regions is ignored. Compared with 'tautomer search on', this option also finds targets whose E/Z stereo double bonds in the tautomer regions do not match under the applied double bond stereo search type.

See the examples on the tautomer search options page.

Axial isomerism

Axial chirality is a special case of stereochemistry that is typically observed in biaryl compounds and in cumulenes with different substituents and an even number of double bonds. Consider the following when searching for axial stereoisomers:

  • Axial stereo search option is switched off by default.
  • Works for substructure, duplicate and full search methods.
  • Global stereo model is used exclusively during Axial stereo search tasks, even if a different model has been selected for the search.
  • Exact Axial stereo search is available only in case of full, full fragment, or duplicate search.
  • This option is not available for markush structures.

Axial chirality of biaryl compounds:

 

[Target and query structures; structure images not available.]

Axial chirality of cumulenes with different substituents and even numbers of double bonds:

 

[Target and query structures; structure images not available.]

 

Syn-anti, endo-exo isomerism of bicyclic molecules

Bicyclic molecules consist of two fused rings. In bridged bicyclic compounds the rings are connected across a sequence of atoms. The substituents of a bicyclic ring system are described as syn, anti or endo, exo according to their position and orientation relative to the bridges of the molecule. A group attached to the highest numbered bridge is described as syn if it is orientated towards the lowest numbered bridge, and as anti if it is orientated away from the lowest numbered bridge. A group is described as exo if it is orientated towards the highest numbered bridge, and as endo if it is orientated away from the highest numbered bridge.
Stereo search differentiates the syn-anti and endo-exo isomers of a bicyclic molecule. Consider the following when exploring stereoisomers of bridged bicyclic compounds:

  • Syn-Anti stereo search option is switched off by default.
  • The stereochemical information can be exported in 2D and in CXSMILES.
  • This option can be used in case of full structure, substructure, similarity, and duplicate search types.
  • Either the local or the global stereo model can be chosen as search strategy. In Syn-Anti stereo search the comprehensive model behaves the same as the local stereo model.
  • Exact Syn-Anti stereo search is available only in case of full, full fragment, or duplicate search.

 

[Target and query structures; structure images not available.]

 

Stereo models

Stereo models describe search strategies for molecules with defined stereochemistry. Although stereochemistry cannot be evaluated reliably in symmetric molecules, these models make it possible to indicate stereochemical information on symmetric molecules for more accurate search results.

Local stereo model

This search model considers local stereo information only (local parity, local double bond stereo configuration, etc.). In other words, it accepts all the given stereochemistry information and does not check ligand equivalences, etc. When a symmetric atom/bond in the query has specified stereochemistry, this model matches only target atoms/bonds with that specified stereochemistry.

[Query, target and hit examples for the local stereo model; structure images not available.]

 

Global stereo model

This search model considers global stereo information (global parity, global double bond stereo configuration, etc.). This means that marked stereo centers with symmetric ligands (on both the query and the target side) are treated as having no stereo information at all. This model is suitable for duplicate, full and full fragment searches, as in these cases the full stereospecific environment is always available for both the query and the target structure. Therefore a symmetric atom/bond with stereo configuration may match an unspecified stereo atom/bond.
Asymmetric, specified query structures match neither asymmetric, unspecified target structures nor symmetric (specified or unspecified) target structures.

[Query, target and hit examples for the global stereo model; structure images not available.]

 

Comprehensive stereo model

This model combines the advantages of the local and global stereo models and is suitable for all search types. It generally applies the local stereo model, but for symmetric ligands the global stereo model is considered. Symmetric targets match regardless of the presence of stereo information.
Asymmetric, specified query structures match symmetric (specified or unspecified) target structures.
Specified (asymmetric or symmetric) query structures do not match asymmetric, but unspecified target structures.
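The three model descriptions can be condensed into one question: does a query with specified stereo hit a target without stereo information? The sketch below is one reading of the prose in the three model sections and nothing more; the function name is hypothetical. Where the prose does not settle a combination (global model, symmetric query, asymmetric target), the function returns None instead of guessing.

```python
def stereo_query_hits_plain_target(model, query_symmetric, target_symmetric):
    """Hit/no-hit for a stereo-specified query against an unspecified target.

    Returns True, False, or None when the descriptions leave the case open.
    Other matching criteria are assumed to allow a match.
    """
    if model == "local":
        # Local: specified stereo only matches specified stereo.
        return False
    if model == "global":
        if not query_symmetric:
            return False  # asymmetric, specified queries need a specified match
        if target_symmetric:
            return True   # stereo on symmetric centers is not credited
        return None       # not settled by the prose
    if model == "comprehensive":
        # Symmetric targets match regardless; asymmetric unspecified never do.
        return target_symmetric
    raise ValueError(f"unknown stereo model: {model}")
```

Under this reading only the comprehensive model lets an asymmetric, specified query hit a symmetric target, while the local model never matches an unspecified target.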

[Query, target and hit examples for the comprehensive stereo model; structure images not available.]

 

Auto stereo model

If no stereo model is specified then the applied stereo model is based on the search type/options and in case of database searches on the table type. The following table shows the stereo model based on these parameters.

 

search type                          markush search: No                   markush search: Yes
all search types except duplicate    comprehensive                        local
duplicate search                     global (for query tables: local)     local

Note: Duplicate search in query tables uses local stereo model in order to be able to distinguish between symmetric query structures with or without stereo information.
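The selection logic of the auto stereo model is small enough to express as a function. A sketch with a hypothetical name and hypothetical string values for the search type; it only restates the table and the note above:

```python
def auto_stereo_model(search_type, markush=False, query_table=False):
    """Pick the stereo model applied when none is specified.

    search_type: e.g. "substructure", "full", "duplicate" (assumed labels)
    markush:     True for markush searches
    query_table: True when a duplicate search runs in a query table
    """
    if markush:
        return "local"
    if search_type == "duplicate":
        # Query tables keep local stereo so that symmetric queries with and
        # without stereo information stay distinguishable.
        return "local" if query_table else "global"
    return "comprehensive"
```

For instance, a plain substructure search resolves to the comprehensive model, and a duplicate search in an ordinary table resolves to the global model.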
 

Matching rules for symmetric queries and/or targets.

In case of stereo searches a stereo query can match a target with the same stereo information, and non-stereo queries can match both stereo and non-stereo targets (except exact stereo search). However, in some cases even a stereo query can match non-stereo target depending on the applied stereo model and the symmetry. The following table presents the hit results assuming a query with stereo information, a target without stereo information, and assuming other criteria allow a match.

Query

Target

hit

query symmetry

example

target symmetry

example

local

global

comprehensive

N

N

N

Y

Y

N

Y

Y

  Relative configuration of tetrahedral stereo centers

For stereogenic centers, both absolute and relative stereo configurations are supported. Both MDL stereo representations (chiral flag and enhanced stereo representation) and the Daylight stereo representation are supported. All molecules originating from Daylight SMILES represent absolute stereo configuration, as SMILES does not support relative configuration.
For a detailed explanation of the theory and examples of the stereo representations, see MDL's Enhanced Stereochemical Representation and the Daylight Theory Manual.

MDL's Enhanced Stereo Representation

In MDL's enhanced stereo representation all stereo center atoms are labeled with one of the following:

  1. ABS
  2. ORn
  3. ANDn

They define a grouping of the stereogenic centers.

Stereogenic centers belonging to ABS represent absolute stereochemistry, i.e. chirality. (All unlabeled stereo centers are also thought to belong to the ABS group by default. Unlabelled stereo centers may be interpreted as an independent AND group only if (1) chiral flag is not set AND (2) the absolute stereo search options (Query/TargetAbsoluteStereo, AbsoluteStereo) are set to false. See the following sections for further explanation.)

Stereogenic centers belonging to an ORn group (e.g. OR1) represent one stereoisomer that is either the structure as drawn (R, S) OR the epimer in which the stereogenic centers have the opposite configuration (S, R).

Stereogenic centers belonging to an ANDn group (e.g. AND1) represent a mixture of two enantiomers: the structure as drawn AND the epimer in which the stereogenic centers have the opposite configuration. (Note that this is not necessarily a racemic mixture, but a mixture of the enantiomers in any ratio, which includes the 1:1 racemic mixture.)

Table 4. Representation of stereo centers

[Each row pairs a molecule drawing with its interpretation; the structure images are not available, so only the interpretations are listed in row order.]

  • A pure sample of one stereoisomer.
  • A pure sample of one of two enantiomers.
  • A pure sample of one of two enantiomers.
  • A sample that is a mixture of the two enantiomers.
  • A pure sample of one of four diastereomers.

Matching rules of the enhanced representation

  • ABS only matches ABS ( S to S and R to R ).
  • OR matches ABS and OR labels.
  • AND matches all labels (ABS, OR and AND).
  • AND and OR groups match both S and R , but the relative configuration must match. (See Table 6. below.)
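The label rules above amount to a small compatibility table. The sketch below assumes that the label named first in each rule is the query label and the labels it "matches" are target labels; the helper names are hypothetical. It covers the label kinds only, not the parity comparison.

```python
# Which target label kinds each query label kind may match (assumed direction).
ALLOWED_TARGET_KINDS = {
    "ABS": {"ABS"},
    "OR": {"ABS", "OR"},
    "AND": {"ABS", "OR", "AND"},
}


def label_kind(label):
    """Strip the group number: 'OR1' -> 'OR', 'AND2' -> 'AND', 'ABS' -> 'ABS'."""
    kind = label.rstrip("0123456789")
    if kind not in ALLOWED_TARGET_KINDS:
        raise ValueError(f"unknown stereo label: {label}")
    return kind


def labels_compatible(query_label, target_label):
    """Can a query center with this label match a target center with that label?"""
    return label_kind(target_label) in ALLOWED_TARGET_KINDS[label_kind(query_label)]
```

So an AND1 query center can match ABS, OR and AND target centers, while an ABS query center can match only an ABS target center.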

Table 5. Matching rules of stereo centers

[Table 5: target and query structures, each set including a structure without stereo information; structure images not available.]

 

Table 6. Matching rules of down wedge query bonds

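[Table 6: target structures, including one without stereo information, against query structures; structure images not available.]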

 

For AND and OR groups the relative configuration of the group must match, i.e. all centers of the group match as drawn or all match the opposite way. There are no restrictions when the chiral centers belong to different groups (see the bottom row in Table 7 below).
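The "all as drawn or all inverted" rule can be checked per group. A sketch under stated assumptions: the function name is hypothetical, every target center has a specified parity, query and target centers are already mapped in the same order, and label compatibility has been checked separately.

```python
MIRROR = {"R": "S", "S": "R"}


def centers_match(query, target):
    """Check parities group by group.

    query:  list of (label, parity), e.g. [("OR1", "R"), ("OR1", "S")]
    target: list of parities in the same center order, e.g. ["S", "R"]
    """
    groups = {}
    for (label, q_parity), t_parity in zip(query, target):
        groups.setdefault(label, []).append((q_parity, t_parity))
    for label, pairs in groups.items():
        as_drawn = all(q == t for q, t in pairs)
        inverted = all(MIRROR[q] == t for q, t in pairs)
        if label == "ABS":
            if not as_drawn:
                return False  # absolute centers must match exactly
        elif not (as_drawn or inverted):
            return False      # a group may flip only as a whole
    return True
```

Because each group is tested independently, two centers in different groups (OR1 and OR2, say) place no restriction on each other, whereas two centers in the same group must both match as drawn or both match inverted.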

Table 7.

[Target and query structures; structure images not available.]


MDL old stereo representation (chiral flag)


In MDL's original stereochemistry representation conventions, a structure with a chiral flag implies that all stereocenters marked with wedge bonds have an absolute configuration (R or S), so a single isomer is present.
If no chiral flag is set, only the relation between the tetrahedral stereo centers is known. The structure could be either of the two stereoisomers (as drawn or its mirror image) or a mixture of the two.

In JChem the chiral flag representation of MDL is not considered by default and all molecules are treated as chiral:

Table 8.

[Target and query structures; structure images not available.]


 

However, when the absolute stereo options (Query/TargetAbsoluteStereo, AbsoluteStereo) are set to false, the chiral flags in MDL molfiles and sdfiles are considered. In this case, molecules lacking the chiral flag are treated as if their unlabeled stereogenic centers were in an AND group, and hence express relative stereo configuration:

Table 9.

[Target and query structures, some of them marked Chiral; structure images not available.]

 

Priority list for checking stereo information


A query filters information on stereocenters in the following order:

  1. Enhanced stereo representation
  2. absoluteStereo parameters ('Assume absolute stereo flag' option at table creation)
    • turned on (default setting): the molecule is assumed to have absolute labels on all stereocenters.
    • turned off: chiral flag check follows.
  3. Chiral flag
    • present: all stereo centers are considered as absolute.
    • absent: all unlabelled stereocenters are assumed to belong to one AND group.
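The priority order above can be read as a short decision chain for a single stereo center. A sketch with a hypothetical function name and parameters; it restates the three steps and is not the JChem API:

```python
def center_interpretation(enhanced_label=None, absolute_stereo=True, chiral_flag=False):
    """Resolve how one stereo center is interpreted.

    enhanced_label:  'ABS', 'OR1', 'AND2', ... or None for an unlabelled center
    absolute_stereo: the absoluteStereo setting (on by default)
    chiral_flag:     whether the molecule carries the MDL chiral flag
    """
    if enhanced_label is not None:
        return enhanced_label   # 1. enhanced stereo representation wins
    if absolute_stereo:
        return "ABS"            # 2. assume absolute labels on all centers
    if chiral_flag:
        return "ABS"            # 3. chiral flag present: absolute
    return "AND"                # 3. chiral flag absent: one common AND group
```

With the default settings every unlabelled center resolves to ABS, and the chiral flag only becomes relevant once the absolute stereo option is turned off.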