Organic chemists often look for molecules that cannot be represented by a single structure. Although it is possible to run multiple structure searches in cascade, it is much more efficient to run a single search using a well-designed query structure. Such a structure often contains query features, possibly including complex conditional expressions for atoms and bonds.
It is possible to define the type of an atom in a custom atom list. If the type of the corresponding atom in the target molecular structure is a member of the list, it is considered a matching atom (Table 1). Not lists can be used to specify atoms to be excluded (Table 2). Please note that the matching of not list atoms may depend on the input format of the query molecule. See details.
Table 1. Atom lists
[Table of query and target structure images; the images are not preserved.]
Table 2. Atom not lists
[Table of query and target structure images; the images are not preserved.]
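The list and not list rules described above can be illustrated with a minimal sketch. It is not JChem code: the class and method names are hypothetical, atoms are reduced to element symbols, and the format-dependent behaviour of not lists mentioned above is ignored.

```java
import java.util.Set;

/** Minimal sketch of atom list and not list matching on element symbols (hypothetical names). */
final class AtomListSketch {

    /** An atom list matches when the target element is a member of the list. */
    static boolean matchesList(Set<String> list, String targetElement) {
        return list.contains(targetElement);
    }

    /** A not list matches when the target element is NOT a member of the list. */
    static boolean matchesNotList(Set<String> notList, String targetElement) {
        return !notList.contains(targetElement);
    }

    public static void main(String[] args) {
        Set<String> list = Set.of("N", "O", "S");        // illustrative list, not taken from Table 1
        System.out.println(matchesList(list, "O"));      // O is a member of the list
        System.out.println(matchesList(list, "C"));      // C is not a member
        System.out.println(matchesNotList(list, "C"));   // C is allowed by the not list
    }
}
```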
A Any atom except hydrogen. It matches neither explicit nor implicit hydrogens. Please note that in JChem the SMARTS primitive "*" is imported as an any atom and does not match plain hydrogens (neither explicit nor implicit). For differences between any atoms imported from different file formats, see the differences section.
AH Any atom, including hydrogen.
Q Hetero (any atom except hydrogen and carbon)
QH Hetero atom or hydrogen (any atom except carbon)
M Metal (contains alkali metals, alkaline earth metals, transition metals, actinides, lanthanides, poor (basic) metals, Ge, Sb and Po)
MH Metal or hydrogen
X Halogen (F, Cl, Br or I)
XH Halogen or hydrogen
Gn Member of group (column) n in the periodic system (n = 1..18)
Attention: G17 is NOT the same as X, as it also contains At!
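The difference between X and G17 noted above can be made concrete with a small sketch. The class is hypothetical and not part of JChem; the element sets contain only the elements named in this text.

```java
import java.util.Set;

/** Sketch of the X, XH and G17 generic query atoms as element sets (hypothetical names). */
final class HalogenVsGroup17 {
    // X as defined in the list above: F, Cl, Br or I.
    static final Set<String> X = Set.of("F", "Cl", "Br", "I");
    // G17 as described above: the same elements plus At.
    static final Set<String> G17 = Set.of("F", "Cl", "Br", "I", "At");

    static boolean matchesX(String element) {
        return X.contains(element);
    }

    static boolean matchesXH(String element) {
        return element.equals("H") || X.contains(element);
    }

    static boolean matchesG17(String element) {
        return G17.contains(element);
    }

    public static void main(String[] args) {
        System.out.println(matchesX("At"));    // At is not in X
        System.out.println(matchesG17("At"));  // At is in G17
        System.out.println(matchesXH("H"));    // XH also accepts hydrogen
    }
}
```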
Table 3. Generic query atoms
[Table of query and target structure images; the images are not preserved.]
a aromatic (has aromatic bond)
A aliphatic (does not have aromatic bond)
D<n> degree (number of explicit connections; default for "n" is one)
H<n> total hydrogens (total number of hydrogen substituents)
h<n> implicit hydrogens (number of implicit hydrogen substituents*)
R<n> rings (number of rings the atom is a member of)
r<n> smallest ring size (size of the smallest ring the atom is a member of)
R ring membership (whether atom is part of a ring or not)
v<n> valence (total bond order)
X<n> connections (number of substituents including hydrogens)
s<n> substitution count (number of non-H substituents)
s0-s5: exact substitution count; s6: 6 or more substitutions
s* substitution as drawn (no extra non-H substituents)
rb<n> ring bond count (number of ring bonds next to the atom)
rb0-rb3: exact ring bond count; rb4: 4 or more ring bonds.
The same constraint can be expressed with the SMARTS "x" property
(see the SMARTS documentation).
rb* ring bond count as drawn (no extra ring bonds)
u unsaturated atom (atom has double, triple or aromatic bond)
* Corresponds to either the ISIS or the Daylight behaviour, depending on the
source of the Molecule object. For details, see the
differences section.
Table 4. Atom properties
[Table of query and target structure images; the images are not preserved.]
Matching of charges, isotopes and radicals is controlled by the following search options:
setOption(OPTION_CHARGE_MATCHING, CHARGE_MATCHING_DEFAULT / CHARGE_MATCHING_EXACT / CHARGE_MATCHING_IGNORE)
setOption(OPTION_ISOTOPE_MATCHING, ISOTOPE_MATCHING_DEFAULT / ISOTOPE_MATCHING_EXACT / ISOTOPE_MATCHING_IGNORE)
setOption(OPTION_RADICAL_MATCHING, RADICAL_MATCHING_DEFAULT / RADICAL_MATCHING_EXACT / RADICAL_MATCHING_IGNORE)
(The constants are defined in chemaxon.sss.SearchConstants.)
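The following sketch shows one way to read the three charge matching modes. It does not call the JChem API: the enum and method are hypothetical. The rule used for the default mode (an uncharged query atom matches any target charge, a charged query atom requires the same charge) is an assumption, because the example images that would confirm it are not preserved here. The isotope and radical options are not modelled.

```java
/** Sketch of the three charge matching modes (hypothetical names, assumed semantics). */
final class ChargeMatchingSketch {

    enum Mode { DEFAULT, EXACT, IGNORE }

    /** Returns true if a query atom charge is compatible with a target atom charge. */
    static boolean chargeMatches(Mode mode, int queryCharge, int targetCharge) {
        switch (mode) {
            case IGNORE:
                return true;                              // charges are not compared at all
            case EXACT:
                return queryCharge == targetCharge;       // charges must be identical
            default:
                // Assumed DEFAULT rule: only a charge drawn on the query is enforced.
                return queryCharge == 0 || queryCharge == targetCharge;
        }
    }

    public static void main(String[] args) {
        System.out.println(chargeMatches(Mode.DEFAULT, 0, 1));  // uncharged query, charged target
        System.out.println(chargeMatches(Mode.EXACT, 0, 1));
        System.out.println(chargeMatches(Mode.IGNORE, -1, 1));
    }
}
```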
The following tables show some examples.
Table 5.
[Table of query and target structure images for setOption(OPTION_CHARGE_MATCHING, ...) with CHARGE_MATCHING_DEFAULT (default), CHARGE_MATCHING_EXACT and CHARGE_MATCHING_IGNORE; the images are not preserved.]
Table 6.
[Table of query and target structure images for setOption(OPTION_ISOTOPE_MATCHING, ...) with ISOTOPE_MATCHING_DEFAULT (default), ISOTOPE_MATCHING_EXACT and ISOTOPE_MATCHING_IGNORE; the images are not preserved.]
Table 7.
[Table of query and target structure images for setOption(OPTION_RADICAL_MATCHING, ...) with RADICAL_MATCHING_DEFAULT (default), RADICAL_MATCHING_EXACT and RADICAL_MATCHING_IGNORE; the images are not preserved.]
Link nodes are atoms which may occur one or more times defining a variable length chain or ring. The link node is denoted by its brackets and the repetition range. All bonds not crossed by the brackets (and connecting parts) are also repeated together with the link node. See examples below.
Table 8.
[Table of link node queries and their possible meanings; the structure images are not preserved.]
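The repetition range of a link node can be pictured as a set of explicit chains. The sketch below is a simplified illustration with hypothetical names: it works on SMILES-like strings, repeats a single unsubstituted chain atom, and does not model the repetition of bonds that are not crossed by the brackets.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: expands a link node with repetition range min..max into explicit chains. */
final class LinkNodeSketch {

    /**
     * @param before   fragment written before the link node
     * @param linkAtom the repeated atom, written as a SMILES-like symbol
     * @param after    fragment written after the link node
     */
    static List<String> expand(String before, String linkAtom, String after, int min, int max) {
        if (min < 1 || max < min) {
            throw new IllegalArgumentException("expected 1 <= min <= max");
        }
        List<String> meanings = new ArrayList<>();
        for (int n = min; n <= max; n++) {
            meanings.add(before + linkAtom.repeat(n) + after);  // n copies of the link atom
        }
        return meanings;
    }

    public static void main(String[] args) {
        // A carbon link node with range 1-3 between O and N.
        System.out.println(expand("O", "C", "N", 1, 3));  // prints [OCN, OCCN, OCCCN]
    }
}
```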
JChem's search supports all valid SMARTS atom expressions. (See Daylight's SMARTS theory manual.)
SMARTS atoms are depicted the following way in Marvin:
[Image: depiction of a SMARTS atom in Marvin; not preserved.]
The following additional query features are handled as part of this:
This query feature allows the use of logical operators (the two "and" operators, "or" and "not") to combine queries into complex expressions. Table 9 shows the operators in the order of their precedence ("!" is evaluated first):
Table 9.
| Operator | Name |
| ! | not (unary operator) |
| & | high precedence and (default operator, i.e. can be omitted between two query expressions) |
| , | or |
| ; | low precedence and |
Table 10. Examples
[Queries in rows, targets in columns; the match marks are not preserved.]
| Query | NCC(O)=O | [N+]CC([O-])=O | [H]OC(=O)N([H])C | COC |
| [OX2H,OX1-] | | | | |
| [O&X2&H,O&X1&-] | | | | |
| [NX3;H2,H1] | | | | |
| [OX2!-] | | | | |
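The precedence rules of Table 9 can be tried out with a small evaluator. This is a simplified sketch with hypothetical names, not a SMARTS parser: an atom is described by the set of primitives that hold for it, the expression is passed without the surrounding brackets, primitives are limited to one letter with optional digits or a charge sign, and recursive SMARTS is not handled.

```java
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch of SMARTS logical operator precedence: "!" first, then "&", then ",", then ";". */
final class SmartsLogicSketch {
    // One primitive: optional "!", then a letter with optional digits (O, X2, H1) or a charge sign.
    private static final Pattern PRIMITIVE = Pattern.compile("(!?)([A-Za-z]\\d*|[+-]\\d*)");

    /** Evaluates an expression against the set of primitives that are true for one atom. */
    static boolean eval(String expr, Set<String> trueFor) {
        for (String lowAnd : expr.split(";")) {        // ";" binds loosest: every part must hold
            boolean any = false;
            for (String alternative : lowAnd.split(",")) {   // ",": at least one alternative holds
                any |= evalAnd(alternative, trueFor);
            }
            if (!any) {
                return false;
            }
        }
        return true;
    }

    /** Evaluates primitives joined by "&" (explicit or omitted), each optionally negated by "!". */
    private static boolean evalAnd(String term, Set<String> trueFor) {
        Matcher m = PRIMITIVE.matcher(term.replace("&", ""));
        while (m.find()) {
            boolean negated = !m.group(1).isEmpty();
            boolean holds = trueFor.contains(m.group(2));
            if (holds == negated) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Set<String> hydroxylO = Set.of("O", "X2", "H");
        Set<String> etherO = Set.of("O", "X2");
        System.out.println(eval("OX2H,OX1-", hydroxylO));  // first alternative holds
        System.out.println(eval("OX2H,OX1-", etherO));     // neither alternative holds
        System.out.println(eval("OX2!-", etherO));         // O and X2 hold, no negative charge
    }
}
```

The difference between the two "and" operators shows up when "or" is involved: "NX3;H2,H1" reads as (N and X3) and (H2 or H1), whereas "NX3&H2,H1" reads as (N and X3 and H2) or H1.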
One of the most powerful features of SMARTS atoms is recursive SMARTS. It describes the environment of an atom with the syntax "$(<SMARTS expression>)". The first atom of the SMARTS expression is matched to the atom in question, and the rest to its environment. The primitive evaluates to true if the expression matches.
Table 11.
| SMARTS | Meaning |
| [OX2$(OaaN)] | Aliphatic oxygen with two connections, next to an aromatic ring having an aliphatic N in ortho position. |
| [OX2$(*aaN)] | Same as above. |
| [$([OX2]aaN)] | Same as above. |
| [NX3;H2,H1;!$(NC=O)] | Primary or secondary amine, not amide. |
| [$(N~*~*~[O!$(O([C,c])[C,c])])] | Aliphatic N three bonds away from a non-ether aliphatic O. |
Table 12.
[Table matching the queries below against target structure images; the images and match marks are not preserved.]
| Query |
| [OX2$(OaaN)] |
| [$(OCC),$(OCN)] |
| [$(O([C,c])[C,c])] |
| [$(N~*~*~[O!$(O([C,c])[C,c])])] |
Please note that uppercase atom symbols match only aliphatic atoms and lowercase atom symbols match only aromatic atoms.
In JChem explicit and implicit hydrogens in the target are treated the same, and hence the presence or absence of plain hydrogens does not affect the result of the search.
In JChem the SMARTS primitive "*" (any atom) does not match to plain hydrogens. (Neither explicit nor implicit.) However, it matches deuterium and charged H. See below.
Further SMARTS examples can be found on Daylight's page.
Pseudo atoms have user-defined atom types, and they only match another pseudo atom of the same name (case insensitive). Commonly used pseudo atoms include "Resin" and "Pol", referring to the often used solid phases in syntheses (Pol is the default pseudo for resin in MDL ISIS/Draw).
Table 13.
[Table of pseudo atom query and target structure images; the images are not preserved.]
It should be noted that there is no chemical intelligence associated with pseudo atoms. This means that if a common abbreviation is used as pseudo atom, it will not match the corresponding molecular group. To achieve this, correct abbreviations (Superatom S-groups) must be used.
JChem search can handle query and target atoms that have lone pairs associated with them. Lone pairs on the query side match explicit and implied lone pairs, but please note that lone pairs are only considered when attached to an atom, i.e. isolated lone pairs will not match anything.
Table 14.
[Table of lone pair query and target structure images; the images are not preserved.]
Querying against bonds can determine if a bond in the target molecule is one of the four basic types (single, double, triple, aromatic) or one of the generic types that are available for fine-tuning query structures (Table 15). The line style represents the type of a bond.
| any | |
| single or double | |
| single or aromatic | |
| double or aromatic |
Table 15. Generic bond types
[Table of query and target structure images; the images are not preserved.]
For the correct use of aromatic bonds and aromatic systems in general, see the Aromatization section under Standardization.
See the Stereochemistry section.
In addition to the bond type discussed above, a bond topology query attribute can be assigned to bonds. This expresses that the bond must be part of a ring or must not. See the examples below.
Table 16. Bond topology
[Table of query and target structure images; the images are not preserved.]
SMARTS bond expressions are also supported. (See Daylight's SMARTS theory manual.)
SMARTS bonds are depicted the following way in Marvin:
[Image: depiction of a SMARTS bond in Marvin; not preserved.]
As with SMARTS atoms, the SMARTS logical operators "!" (not), "&" and ";" (high and low precedence and) and "," (or) can be used. "&" is the default operator, hence "and" is assumed if there is no operator between two SMARTS primitives. Furthermore, the following characters have valid meanings:
Table 17.
| Bond expression | Meaning |
| - | single bond |
| = | double bond |
| # | triple bond |
| : | aromatic bond |
| @ | any ring bond |
| / | directional bond: single "up" (used at cis/trans) |
| \ | directional bond: single "down" (used at cis/trans) |
Table 18.
| SMARTS | Meaning |
| C-,=,#C | Two aliphatic carbons connected by single, double or triple bond. |
| *-!@* | Two atoms connected by a nonring single bond. |
| *@-,!@&/*=*@-,!@&/* | Double bond between two single bonds in ring or not in ring but in trans configuration. |
Table 19.
[Table matching the queries below against target structure images; the images and match marks are not preserved.]
| Query |
| C-,=,#C |
| *-!@* |
| *@-,!@&/*=*@-,!@&/* |
Further SMARTS examples can be found on Daylight's page.
Coordination compounds can be registered and searched for in JChem structure databases. Both "atom to atom" and "multicenter" (involving more than two atoms) representations are supported.
Matching of "atom to atom" coordinate bonds is similar to matching other bond types. The direction of the coordinate bond arrow is not checked. See examples below. (Q stands for hetero atom, M for any metal atom. The thin dotted bond represents an ANY query bond.)
Table 20.
[Table of coordinate bond query and target structure images; the images are not preserved.]
Multicenter coordinate bonds are handled as if the atoms at the opposite ends of the coordinate bond had individual coordinate bonds between them. This means that the following molecule pairs are equivalent. (The molecule representation used conforms to the IUPAC recommendation: atom to atom coordinate bonds are displayed by an arrow and multicenter coordinate bonds are denoted by a thick dashed line.)
[Images of the equivalent molecule pairs; not preserved.]
So individual and multicenter representations can both be used during searching, in all combinations. See examples below. (The thin dotted bonds represent ANY query bond types.)
Table 21.
[Table of individual and multicenter coordinate bond query and target structure images; the images are not preserved.]
A position variation bond (or variable point of attachment) expresses that a bond may be attached to multiple positions (atoms); it is most often used for rings. It is represented by a multicenter atom at one or both ends of the position variation bond. Its representation and drawing in Marvin is described in the Marvin Sketch help. See examples below.
Table 22. Meaning of position variation.
[Table of a position variation query and its possible meanings; the structure images are not preserved.]
Table 23. Matching of position variation queries.
[Table of query and target structure images; the images are not preserved.]
This section describes the features related to different types of components. A component is a set of connected atoms in a molecular drawing. The connection can be:
This feature treats components as atoms connected by bonds. In SMARTS queries it can be specified whether different components (fragments) of the query should appear in the same or in different components of the target. This is represented by grouping parentheses around the components in the SMARTS string. Please note that there is no graphical representation of this feature in Marvin.
Table 24.
| SMARTS representation | Meaning |
| C.C | No restrictions. |
| (C.C) | The two carbons must appear in the same component. |
| (C).(C) | The two carbons must appear in different components. |
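The three cases of Table 24 amount to a check on the target components that the two query fragments land in. The sketch below uses hypothetical names and assumes that the target atoms have already been assigned component ids; it is not JChem code.

```java
/** Sketch of component level grouping for two query fragments (hypothetical names). */
final class ComponentGroupingSketch {

    /** NONE corresponds to C.C, SAME to (C.C) and DIFFERENT to (C).(C). */
    enum Grouping { NONE, SAME, DIFFERENT }

    /**
     * @param componentOfAtom componentOfAtom[i] is the id of the target component containing atom i
     * @param hitA            target atom matched by the first query fragment
     * @param hitB            target atom matched by the second query fragment
     */
    static boolean allowed(Grouping grouping, int[] componentOfAtom, int hitA, int hitB) {
        boolean same = componentOfAtom[hitA] == componentOfAtom[hitB];
        switch (grouping) {
            case SAME:
                return same;
            case DIFFERENT:
                return !same;
            default:
                return true;   // no restriction
        }
    }

    public static void main(String[] args) {
        int[] components = {0, 0, 1};   // target CC.C: atoms 0 and 1 are bonded, atom 2 is separate
        System.out.println(allowed(Grouping.SAME, components, 0, 1));
        System.out.println(allowed(Grouping.DIFFERENT, components, 0, 1));
        System.out.println(allowed(Grouping.DIFFERENT, components, 0, 2));
    }
}
```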
Table 25.
[Table matching the queries below against targets; the targets and match marks are not preserved.]
| Query |
| C.C |
| (C.C) |
| (C).(C) |
| (C).(C).C |
This feature relates to the use of brackets (S-groups) of type COM (component), FOR (formulation) and MIX (mixture). A component here is a set of atoms contained by a component bracket.
An unordered mixture (MIX type S-group) consists of several unordered components. For these types of mixtures, the order of addition during the preparation is not important. Example:

Ordered mixtures (FOR type S-groups), on the other hand, contain ordered components, which define the order of addition. Example:
[Image: example of an ordered mixture; not preserved.]
The component grouping of component brackets is considered during the matching, so all atoms drawn inside component brackets in the query can only match atoms that are contained in the same component brackets in the target and separate components can only match separate components.
Component brackets without surrounding mix or for brackets are considered as being in mix (unordered mixture) brackets and molecules not drawn in any component brackets are considered to be in the same component.
Table 26.
[Table of component bracket query and target structure images; the images are not preserved.]
Unordered mixture (mix) queries match both unordered (mix) and ordered (for) mixtures. However, ordered (for) mixtures only match ordered mixtures, and the component numbering must keep order. Examples:
Table 27.
[Table of ordered and unordered mixture query and target structure images; the images are not preserved.]
During reaction searching, reaction component grouping is maintained; see the reaction component handling section.
Exact fragment matching ensures that all query components (atoms connected by bonds) match only full components. See its description in the Search types section.
For the sake of simplicity, organic chemists usually do not draw hydrogen atoms on molecules, but in some models used to represent molecules the hydrogens are shown implicitly or explicitly. Whatever display mode one prefers, all free valences of the atoms are considered to be filled with hydrogens. In query structures, however, explicit hydrogens are significant: an explicitly drawn query hydrogen requires the target to contain a hydrogen in that position (Table 28).
Table 28.
[Table of explicit hydrogen query and target structure images; the images are not preserved.]
Searches can include extra conditions formulated in the Chemical Terms language. Chemical Terms is a chemistry language which allows users to formulate complex chemical questions, expressions and rules. Chemical Terms can contain references to functional groups, other structural elements and physico-chemical properties. The syntax is described in the Chemical Terms Reference. Search specific functions contained in the search context provide access to the query and the target molecules, the search hit array and its elements:
mol(), target(): both refer to the search target molecule
query(): refers to the search query molecule
m(int i): refers to the index of the query atom with atom map i
hit(), h(): both refer to the search hit array
hit(int i), h(int i): both refer to the i-th element of the search hit array, which is the index of the target atom matching the query atom with atom index i
hm(int i): refers to the index of the target atom matching the query atom with atom map i (shorthand for h(m(i)))
The default input molecule is the target molecule (e.g. mass() is the same as
mass(target()), both refer to the molecule mass of the target molecule).
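The relationship between m(), h() and hm() can be sketched with two lookup arrays. The class below is hypothetical and only mirrors the definitions given above; it is not the Chemical Terms implementation, and the use of 0-based atom indices is an assumption of the sketch.

```java
/** Sketch of the m(), h() and hm() lookups described above (hypothetical class). */
final class HitLookupSketch {
    private final int[] queryIndexByMap;  // atom map number -> query atom index (slot 0 unused)
    private final int[] hit;              // query atom index -> matching target atom index

    HitLookupSketch(int[] queryIndexByMap, int[] hit) {
        this.queryIndexByMap = queryIndexByMap.clone();
        this.hit = hit.clone();
    }

    /** Index of the query atom that carries atom map number map. */
    int m(int map) {
        return queryIndexByMap[map];
    }

    /** Index of the target atom matched by the query atom with index i. */
    int h(int i) {
        return hit[i];
    }

    /** Index of the target atom matched by the query atom with the given map; same as h(m(map)). */
    int hm(int map) {
        return h(m(map));
    }

    public static void main(String[] args) {
        // Query atom 2 carries atom map 1; query atoms 0, 1, 2 matched target atoms 5, 3, 7.
        HitLookupSketch hits = new HitLookupSketch(new int[] {-1, 2}, new int[] {5, 3, 7});
        System.out.println(hits.hm(1));   // prints 7
    }
}
```

A filter such as pka(hm(1)) therefore addresses the target atom that was matched by the query atom labelled with atom map 1, regardless of where that atom sits in the query atom order.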
The filtering expression can be set by
setFilter(filteringExpression)
setFilter(filteringExpression, config)
evaluator.xml.
The following table shows some examples (pKa values are shown at target atoms).
Table 29.
[Table of query and target structure images for the filters below; the images are not preserved.]
setFilter("pka(hm(1)) > 2")
setFilter("pka('acidic', hm(1)) > 2 && mass() > 100")
A set of working examples is also available.