Atom lists are useful tools for creating query structures with variable atoms. Through R-groups, JChem provides similar variability for functional groups or other substructures in queries against molecule or reaction targets (tables).
If you are interested in searching combinatorial Markush library targets (tables) described by R-group notation, see the section on Markush structures below.
An R-group query structure consists of three components: a root structure (often referred to as the scaffold), a set of R-group definitions, and R-group conditions. The root structure contains the portion of the query structure that does not vary among the structures retrieved. R-groups are attached as substituents to the root, and their sites are marked with R1, R2, R3, etc. symbols. Multiple R-groups can be attached to one root, even to a single atom of the structure. One R-group can be attached multiple times to the same root, but this does not mean that all of these attachment sites must be filled by the same definition (see occurrence conditions below for further information).
R-group definitions are variable lists of ligands connected to specific positions of the root structure by their attachment points.
The occurrence condition defines the number of R-group sites that must be
occupied. For example, the occurrence designation >0 for R1 specifies that
the target molecule must contain at least one of the R1 substituents listed in
the R1 group definition on its corresponding atom. This is the default
occurrence value.
The following specifications are valid:
The occurrence can be specified as a comma-separated list; the elements are
combined with an OR relation. Example: "0,2-5,>6" means the specified
R-group may occur zero, two to five, or more than six times.
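The OR semantics of such a list can be illustrated with a small standalone sketch. This is not JChem code: `parse_occurrence` is a hypothetical helper, and it only understands the three item forms visible in the example above (`n`, `n-m`, `>n`).

```python
import re

def parse_occurrence(spec):
    """Return a predicate that tells whether a count satisfies the spec.

    Hypothetical helper; supports only the item forms n, n-m and >n.
    """
    tests = []
    for item in spec.split(","):
        item = item.strip()
        if m := re.fullmatch(r"(\d+)-(\d+)", item):
            lo, hi = int(m.group(1)), int(m.group(2))
            tests.append(lambda n, lo=lo, hi=hi: lo <= n <= hi)
        elif m := re.fullmatch(r">(\d+)", item):
            bound = int(m.group(1))
            tests.append(lambda n, bound=bound: n > bound)
        elif re.fullmatch(r"\d+", item):
            value = int(item)
            tests.append(lambda n, value=value: n == value)
        else:
            raise ValueError(f"unsupported occurrence item: {item!r}")
    # The list elements are in OR relation: one satisfied item is enough.
    return lambda n: any(test(n) for test in tests)

matches = parse_occurrence("0,2-5,>6")
print([n for n in range(9) if matches(n)])  # [0, 2, 3, 4, 5, 7, 8]
```

Counts 1 and 6 are rejected because no element of the list covers them.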
If the conditions for an R-group have to be satisfied only when the conditions of
another R-group are satisfied, use an If/Then condition. For example,
If R1 then R2 means that if the conditions for R1 are satisfied,
then the conditions for R2 must also be satisfied. If the conditions for R1 are
not satisfied, the conditions for R2 are ignored. This If/Then condition implies
that the molecule may be retrieved even though R1 is not satisfied.
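Read this way, an If/Then condition behaves like a logical implication. The following minimal sketch (the function name is illustrative, not part of JChem) prints the full truth table:

```python
def passes_if_then(r1_satisfied, r2_satisfied):
    """'If R1 then R2': R2 only matters when R1 is satisfied."""
    return (not r1_satisfied) or r2_satisfied

for r1 in (False, True):
    for r2 in (False, True):
        print(f"R1={r1!s:5} R2={r2!s:5} -> passes: {passes_if_then(r1, r2)}")
```

Only the combination where R1 is satisfied and R2 is not fails the condition.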
If the RestH condition is set for an R-group, the hit molecules contain no ligands on that atom other than hydrogen or those specified in the R-group definition. If the RestH condition is not set, R-group sites can carry any additional non-hydrogen ligands as well.
Table 1. R-group query structures

| Structure | Conditions |
|---|---|
| target | |
| query (first structure) | default; R1>1; R2>1; RestH on R1 |
| query (second structure) | default; if R1 then R3; if R2 then R3 |
Certain combinatorial libraries can be described by Markush structures, which use generic features to express structural variability. The library of a Markush structure is the total set of specific molecules that the Markush structure describes.
JChem allows searching in combinatorial libraries described as Markush structures without the need to explicitly enumerate all molecules of the Markush library. The search handles the same generic features as the Markush Enumeration Plugin.
R-groups (also referred to as "substituent variation") are the most widely known Markush generic features. The variable part of the structure is denoted by an R-atom (e.g. R1), and the definitions are given separately. In each definition, the connection points must be defined to show where the bonds of the R-atom are linked. R-atoms can appear in both rings and chains and can have up to two attachment points. The same R-atom can appear multiple times, and the different occurrences are handled as different cases, so they can be substituted with different definitions. R-group nesting in R-group definitions is allowed to any depth, but without recursion: an R-group definition cannot use the R-atom it is defining, not even indirectly through other embedded R-atoms. R-groups up to number R32767 can be used.
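The "no recursion" rule for nested definitions amounts to a cycle check on the graph of which R-group uses which. A minimal sketch under that assumption follows; the `uses` dictionary and the function name are hypothetical and unrelated to JChem's own data structures.

```python
def find_recursive_rgroups(uses):
    """Return the R-groups whose definitions reach themselves.

    `uses` maps an R-group name to the set of R-group names that
    appear inside its definitions (a hypothetical representation).
    """
    def reaches_itself(start):
        stack = list(uses.get(start, ()))
        seen = set()
        while stack:
            node = stack.pop()
            if node == start:
                return True
            if node not in seen:
                seen.add(node)
                stack.extend(uses.get(node, ()))
        return False

    return sorted(name for name in uses if reaches_itself(name))

nested_ok = {"R1": {"R2"}, "R2": {"R3"}, "R3": set()}
nested_bad = {"R1": {"R2"}, "R2": {"R1"}, "R3": {"R1"}}
print(find_recursive_rgroups(nested_ok))   # []
print(find_recursive_rgroups(nested_bad))  # ['R1', 'R2']
```

In the second example R3 is not reported: it uses a recursive R-group but never reaches itself.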
Atom lists are another example of substituent variation. They define a list of atom types allowed at a given position. There is no restriction on the length of the list or on the number of bonds of an atom list.
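To make the relationship between atom lists and the resulting library concrete, here is a toy enumeration over a string template. It is only an illustration: the template syntax and `enumerate_atom_lists` are assumptions of this sketch, and JChem itself searches Markush structures without enumerating them.

```python
from itertools import product

def enumerate_atom_lists(template, atom_lists):
    """Fill each numbered placeholder with every atom of its list.

    `template` is a SMILES-like string with {0}, {1}, ... placeholders,
    and `atom_lists[i]` holds the atom types allowed at placeholder i.
    """
    return [template.format(*choice) for choice in product(*atom_lists)]

members = enumerate_atom_lists("C{0}C{1}", [["N", "O"], ["F", "Cl", "Br"]])
print(members)       # ['CNCF', 'CNCCl', 'CNCBr', 'COCF', 'COCCl', 'COCBr']
print(len(members))  # 6, the product of the list lengths
```

The library size grows multiplicatively with each atom list, which is why avoiding explicit enumeration matters for larger Markush structures.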
The following bond lists (generic bond types) are supported: single or double, any (single, double or triple), single or aromatic, double or aromatic. The any bond can also implicitly match aromatic bonds when it is part of a potentially aromatic system. See: Markush aromatization.
Link nodes are atoms that may repeat between two of their designated bonds (called outer bonds, denoted by brackets). All other substituents, if any, repeat together with the atom. In the results, the new bonds between the repeating atoms have the bond type of the lower-order outer bond.
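The repetition rule can be sketched with SMILES-like strings. Everything here is an illustrative assumption: the function name, the explicit repeat range (`min_rep`, `max_rep`), and the flat string representation are not taken from JChem. The one rule taken from the text is that bonds created between repeated atoms get the lower of the two outer bond orders.

```python
BOND_SYMBOL = {1: "-", 2: "=", 3: "#"}

def expand_link_node(left, unit, right, left_order, right_order,
                     min_rep, max_rep):
    """Expand a link node into one SMILES-like string per repeat count.

    `unit` is the link node together with its own substituents, which
    repeat with it. `left_order` and `right_order` are the orders of
    the two outer bonds.
    """
    inner = BOND_SYMBOL[min(left_order, right_order)]
    results = []
    for count in range(min_rep, max_rep + 1):
        chain = inner.join([unit] * count)
        results.append(
            f"{left}{BOND_SYMBOL[left_order]}{chain}"
            f"{BOND_SYMBOL[right_order]}{right}"
        )
    return results

print(expand_link_node("N", "C(C)", "O", 1, 1, 1, 3))
# ['N-C(C)-O', 'N-C(C)-C(C)-O', 'N-C(C)-C(C)-C(C)-O']
```

The methyl substituent written as `(C)` is repeated with every copy of the link node, mirroring the statement that other substituents repeat together with the atom.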
Position variation bonds are bonds attached to variable atoms at one or both end positions. The set of variable atoms is drawn as a multicenter group. A position variation bond connects one atom from one end position to one atom from the other end position. If the end position is a single atom, the bond is attached to this atom; if the end position is a multicenter group, the bond is attached to an arbitrary member of the group.
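As a rough illustration of the paragraph above, the concrete bonds described by one position variation bond are the pairings of one atom from each end position. The sketch below uses plain atom indices, and the function name is hypothetical.

```python
from itertools import product

def position_variation_bonds(end_a, end_b):
    """List every concrete bond a position variation bond can stand for.

    Each end is a list of atom indices: a single-element list for a
    plain atom, several elements for a multicenter group.
    """
    return list(product(end_a, end_b))

# Atom 7 bonded to any member of the multicenter group {1, 2, 3}.
print(position_variation_bonds([7], [1, 2, 3]))
# [(7, 1), (7, 2), (7, 3)]
```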
Limitations:
If a link node is a member of a multicenter group, the group also includes the
repeated atoms, provided that the original multicenter group contains no other atoms from the
link fragment. Otherwise, the position variation bond is part of the link fragment and is repeated together with the
link node (see example structures). If the position variation bond is part of the link fragment, the multicenter group
can have atoms only within the link fragment and the link node atom.
Table 2. Simple substructure search examples: targets and a substructure query. (The bond denoted by dots is an Any bond: single, double, or triple.)
Table 3. Simple exact structure search examples: targets and exact structure queries. (The bond denoted by dots is an Any bond: single, double, or triple.)
Table 4. Exact structure search examples where explicit hydrogens are not considered: targets and an exact structure query.
When a query matches a Markush structure, there are different ways of displaying the hit. One possibility is to color the matching parts of the original Markush structure, but it may mean that the highlighting is spread across different fragments (R-group definitions) when the query overlaps variable parts. Markush structure reduction is a technique wherein the variable parts overlapping the hit are expanded (substituted with the appropriate specific definition). This way the hit highlighting is always visible as a whole and part of the scaffold. (Note that the resulting structure of Markush structure reduction may still contain generic features.)
Table 5. Structure reduction: for each substructure query against the target, hit coloring in the original Markush structure compared with Markush structure reduction to the hit.
With the introduction of generic notation in target structures, it is possible to formulate ring systems with ambiguous aromaticity status: some enumerations of the ring are aromatic, and others are not. See a simple example below.
![]() |
Therefore, for Markush targets, it is not possible to separate standardization and searching entirely in the way described in the Standardization section. Instead, aromaticity is handled in a more complex way that ensures that no match is lost. (However, there may be false positives if the query does not match a full ring. See examples below.)
Standardization for Markush targets (tables) consists solely of a special aromatization method: Markush aromatization. It divides the rings of a Markush structure with generic features into three different categories:
Searching treats aromatic and nonaromatic rings the same way as for specific structures. However, ambiguous rings are processed in two steps:
Table 6. Aromaticity in Markush targets: targets and substructure queries.