Stereochemistry

Biological systems are highly stereoselective, so a chemical structure search engine needs stereospecific query tools. JChem handles tetrahedral and double bond E/Z stereochemistry. Relative tetrahedral stereo configuration and different stereo models can also be used, and various search options modify search behaviour related to stereochemistry.

When the query does not contain stereo information, the hits will include results both with and without stereo information. Otherwise, the stereo information is taken into account during the search.

Search options may modify the above behaviour. The following stereo search options are available:

Tetrahedral stereochemistry

Tetrahedral stereochemistry information is derived from different molecular features, depending on dimensionality:

Table 1 depicts a few examples of tetrahedral stereo matching, assuming absolute stereochemistry (see the next section for further details).

Table 1. Tetrahedral stereo matching (rows: query; columns: target; structure images not available)

Relative configuration of tetrahedral stereo centers

For stereogenic centers, both absolute and relative stereo configurations are supported. We support both MDL stereo representations (the chiral flag and the enhanced stereo representation) and the Daylight stereo representation. All molecules originating from Daylight SMILES represent absolute stereo configuration, as SMILES does not support relative configuration.

For a detailed explanation of the theory and examples of stereo representations, see MDL's Enhanced Stereochemical Representation and the Daylight Theory Manual.

MDL's Enhanced Stereo Representation

In MDL's enhanced stereo representation all stereo center atoms are labeled with one of the following:
  1. ABS
  2. ORn
  3. ANDn
They define a grouping of the stereogenic centers.

Stereogenic centers belonging to ABS represent absolute stereochemistry, i.e. chirality. (All unlabelled stereo centers are also considered to belong to the ABS group by default. Unlabelled stereo centers are interpreted as an independent AND group only if (1) the chiral flag is not set and (2) the absolute stereo search options (Query/TargetAbsoluteStereo, AbsoluteStereo) are set to false. See the following sections for further explanation.)

Stereogenic centers belonging to an ORn group (e.g. OR1) represent one stereoisomer that is either the structure as drawn (R, S) OR the epimer in which the stereogenic centers have the opposite configuration (S, R).

Stereogenic centers belonging to an ANDn group (e.g. AND1) represent a mixture of two enantiomers: the structure as drawn AND the epimer in which the stereogenic centers have the opposite configuration. (Note that the mixture is not necessarily racemic: the enantiomers may be present in any ratio, and a 1:1 (racemic) mixture is one such case.)

Table 2. Representation of stereo centers (structure images not available)

Interpretation of each depicted molecule:
  • A pure sample of one stereoisomer.
  • A pure sample of one of the two depicted enantiomers (one or the other).
  • A pure sample of one of the two depicted enantiomers (one or the other).
  • A sample that is a mixture of the two depicted enantiomers (one and the other).
  • A pure sample of one of the depicted diastereomers.
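The ABS, ORn and ANDn labels described above can be read as a recipe for enumerating what a drawing stands for. The following is a minimal sketch of that reading in plain Python, not JChem code: the function names and the (parity, label) data layout are hypothetical, and parities are simplified to the letters R and S. Each OR group doubles the number of possible pure samples, and each AND group doubles the number of isomers inside a sample.

```python
from itertools import product


def invert(parity):
    """Return the opposite configuration of a single stereo center."""
    return {"R": "S", "S": "R"}[parity]


def interpretations(centers):
    """Enumerate the samples a labelled drawing may represent.

    centers: list of (drawn_parity, label) pairs, where label is "ABS",
    "OR1", "AND2", ... (hypothetical toy layout, not a JChem structure).
    Returns a list of possible samples; each sample is a frozenset of
    isomers, and each isomer is a tuple of parities in atom order.
    """
    labels = sorted({label for _, label in centers if label != "ABS"})
    or_groups = [g for g in labels if g.startswith("OR")]
    and_groups = [g for g in labels if g.startswith("AND")]

    def isomer(flipped):
        # Invert every center whose group is in `flipped`, keep the rest.
        return tuple(invert(p) if g in flipped else p for p, g in centers)

    samples = []
    # OR groups: the sample is ONE of the alternatives.
    for or_choice in product([False, True], repeat=len(or_groups)):
        or_flipped = {g for g, f in zip(or_groups, or_choice) if f}
        mixture = set()
        # AND groups: the sample contains BOTH alternatives.
        for and_choice in product([False, True], repeat=len(and_groups)):
            and_flipped = {g for g, f in zip(and_groups, and_choice) if f}
            mixture.add(isomer(or_flipped | and_flipped))
        samples.append(frozenset(mixture))
    return samples
```

For example, `interpretations([("R", "OR1"), ("S", "OR1")])` yields two single-isomer samples, (R, S) or (S, R), while the same centers labelled AND1 yield one sample containing both.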

Matching rules of the enhanced representation

Table 3. Matching rules of stereo centers (rows: query; columns: target; both include a "No stereo info" case; structure images not available)

Table 4. Matching rules of down wedge query bonds (rows: query; columns: target, including a "No stereo info" case; structure images not available)

For AND and OR groups, the relative configuration of the group must match, i.e. all centers of the group match as drawn or all match the opposite way. There are no restrictions between chiral centers that belong to different groups (see the bottom row of Table 5).

Table 5. (rows: query; columns: target; structure images not available)
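The group rule stated before Table 5 (within a group, all centers match as drawn or all match inverted; different groups are independent) can be sketched as a small check. This is an illustrative toy model in plain Python, not the JChem matching code: it assumes the atom mapping between query and target is already fixed, that the target carries plain absolute parities, and that ABS query centers must match exactly as drawn. The function name and data layout are hypothetical.

```python
def groups_match(query, target):
    """Check the relative-configuration rule for enhanced stereo groups.

    query:  list of (parity, label) pairs, label being "ABS", "OR1", "AND1", ...
    target: list of parities for the mapped target atoms, in the same order.
    (Hypothetical toy layout; parities are the letters "R" and "S".)
    """
    agreement = {}
    for (q_parity, label), t_parity in zip(query, target):
        agreement.setdefault(label, []).append(q_parity == t_parity)

    for label, same in agreement.items():
        if label == "ABS":
            # Assumption: absolute centers must match exactly as drawn.
            if not all(same):
                return False
        elif not (all(same) or not any(same)):
            # AND/OR group: all as drawn, or all the opposite way.
            return False
    return True
```

With both centers in OR1, a query drawn (R, S) accepts targets (R, S) and (S, R) but rejects (R, R). If the two centers sit in OR1 and OR2 instead, (R, R) is accepted as well, because each group is judged on its own.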

MDL old stereo representation (chiral flag)

In MDL's original stereochemistry representation conventions, a structure with a chiral flag implies that all stereocenters marked with wedge bonds have an absolute configuration (R or S), so a single isomer is present.

In JChem the chiral flag representation of MDL is not considered by default and all molecules are treated as chiral:

Table 6. (rows: query; columns: target; structure images not available)

However, when the absolute stereo options (Query/TargetAbsoluteStereo, AbsoluteStereo) are set to false, the chiral flags in MDL molfiles and SD files are considered. In this case, molecules lacking the chiral flag are treated as if their unlabelled stereogenic centers were in an AND group, and therefore express relative stereo configuration:

Table 7. (rows: query; columns: target; structures carrying the chiral flag are labelled "Chiral"; structure images not available)

Priority list of checking stereoinformation

Stereo information on stereocenters is checked in the following order:

  1. Enhanced stereo representation
  2. absoluteStereo parameters ('Assume absolute stereo flag' option at table creation)
    • turned on (default setting): the molecule is assumed to have absolute labels on all stereocenters.
    • turned off: chiral flag check follows.
  3. Chiral flag
    • present: all stereo centers are considered as absolute.
    • absent: all unlabelled stereocenters are assumed to belong to one AND group.
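The priority list above amounts to a three-step decision for each stereocenter. The sketch below restates it in plain Python as a reading aid; the function and parameter names are hypothetical and do not correspond to JChem API names.

```python
def interpret_center(enhanced_label=None, assume_absolute=True, chiral_flag=False):
    """Decide how one stereocenter is interpreted, following the priority list.

    enhanced_label:  "ABS", "OR1", "AND2", ... or None if the center is unlabelled.
    assume_absolute: state of the absolute stereo option (on by default).
    chiral_flag:     whether the molecule carries the MDL chiral flag.
    (All three names are hypothetical, chosen for this illustration.)
    """
    # 1. An enhanced stereo label always wins.
    if enhanced_label is not None:
        return enhanced_label
    # 2. With the absolute stereo option on, unlabelled centers are absolute.
    if assume_absolute:
        return "ABS"
    # 3. Otherwise the chiral flag decides.
    if chiral_flag:
        return "ABS"
    # No flag: all unlabelled centers share one AND group.
    return "AND"
```

Only the last branch produces relative stereochemistry, which is why the chiral flag has no visible effect under the default settings.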

E/Z stereochemistry of double bonds

Ligand pairs of a stereo double bond define a stereo configuration, referred to as cis/trans or E/Z configuration. In 2D and 3D molecules this configuration is derived from the atomic coordinates; for molecules without coordinates (0D, such as SMILES), stereo double bonds are distinguished in other ways. (For example, SMILES uses the directional bonds / and \ on the ligands, and the CML and MRV formats use a bond flag in the 0D case: the <bondStereo> tag.)

There is a search option which controls the behaviour regarding double bond cis/trans isomerism: setDoubleBondStereoMatchingMode(). It can set three different search states:

In the case of DBS_MARKED, a small box should be placed on the query double bond to indicate the stereo search flag. Double bonds marked this way are treated as stereo bonds during the search: the corresponding double bond in the target structure must have the same stereo configuration as drawn in the query (Table 8).

Table 8. Stereo double bonds (bond drawings not available; each entry gives the meaning of one query bond drawing)

  • cis (the two atoms are on the same side of the double bond)
  • trans (the two atoms are on the opposite sides of the double bond)
  • cis or trans (stereo bond with either cis or trans configuration)
  • cis or trans (stereo bond with either cis or trans configuration)
  • not trans
  • not cis

Examples (DBS_MARKED): (rows: query; columns: target; structure images not available)
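The query bond meanings listed in Table 8 can be thought of as sets of acceptable target configurations. The sketch below is a plain Python illustration, not JChem code. It rests on one assumption that this page does not state: "not trans" and "not cis" are taken to also accept a target double bond with no specified configuration. The names used are hypothetical.

```python
# Acceptable target configurations for each query bond meaning.
ALLOWED_TARGETS = {
    "cis": {"cis"},
    "trans": {"trans"},
    "cis or trans": {"cis", "trans"},
    # Assumption: "not X" also admits a target bond without stereo information.
    "not trans": {"cis", "unspecified"},
    "not cis": {"trans", "unspecified"},
}


def double_bond_matches(query_bond, target_config):
    """Return True if a marked query double bond accepts the target bond.

    query_bond:    one of the keys of ALLOWED_TARGETS.
    target_config: "cis", "trans" or "unspecified".
    """
    if target_config not in {"cis", "trans", "unspecified"}:
        raise ValueError(f"unknown target configuration: {target_config!r}")
    return target_config in ALLOWED_TARGETS[query_bond]
```

Under this reading, a "cis or trans" query bond rejects a target bond without stereo information, whereas "not trans" accepts it.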

Stereo models

The stereo model defines how stereochemistry should be evaluated at symmetrical parts of the query and target, with stereochemistry possibly incompletely defined.

Local stereo model

This method is the default. It instructs the search to use local stereo information only (local parity, local double bond stereo configuration, etc). In other words it accepts all given stereochemistry information, and does not check ligand equivalences, etc. When a symmetric atom/bond in the query is specified, this method only matches target atoms/bonds with specified stereochemistry. (For example, query C[C@](C)(C)C does not match CC(C)(C)C.)

Global stereo model

This model instructs the search to use global stereo information (global parity, global double bond stereo configuration, etc.). Stereo centers with symmetrical ligands are assumed to have no stereo information at all, both in the query and in the target. This model is suitable for perfect, exact and exact fragment searches, as in these cases the full stereospecific environment is always available for both the query and target structures. (Therefore a symmetrical atom/bond with stereo configuration may match an unspecified stereo atom/bond. For example, C[C@](C)(C)C matches CC(C)(C)C.)

Comprehensive stereo model

This model combines the advantages of the local and global stereo models and is suitable for all search types. In principle, it is similar to the local stereo model, except when the target is symmetrical: in that case the match is accepted regardless of stereo information. For example, the substructure query C[C@](C)(C)C matches all of CC(C)(C)C, C[C@@](C)(C)C and CCC[C@@](C)(CC)C(C)C.
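The difference between the three stereo models can be condensed into a per-center decision. The following toy model in plain Python is derived only from the descriptions above and is not the JChem implementation. It assumes that symmetry and parity agreement have already been determined for the atom mapping under consideration, and every name in it is hypothetical.

```python
def center_matches(model, q_specified, t_specified, parities_agree,
                   q_symmetric, t_symmetric):
    """Decide whether one query stereo center matches its mapped target atom.

    model:          "local", "global" or "comprehensive".
    q_specified:    the query center carries stereo information.
    t_specified:    the target center carries stereo information.
    parities_agree: the parities agree for the atom mapping being tested.
    q_symmetric / t_symmetric: the center has symmetrical ligands.
    """
    if model == "global":
        # Symmetrical centers are treated as having no stereo information.
        q_specified = q_specified and not q_symmetric
        t_specified = t_specified and not t_symmetric
    elif model == "comprehensive" and t_symmetric:
        # A symmetrical target is accepted regardless of stereo information.
        return True
    elif model not in ("local", "comprehensive"):
        raise ValueError(f"unknown stereo model: {model!r}")

    if not q_specified:
        # A query center without stereo information matches anything.
        return True
    return t_specified and parities_agree
```

For the query C[C@](C)(C)C against the target CC(C)(C)C (query specified, target unspecified, both symmetrical), this returns False for "local" and True for "global" and "comprehensive", in line with the examples given for each model.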
