Search options

A few selected search options are described below.

Tautomer search

This search option instructs the search engine to look for all tautomeric forms of the query, as generated by the Marvin Tautomers plugin. (For alternative ways of handling tautomers, see JChem Database Concepts.)

For the duplicate search type, the tautomer duplicate filtering approach is used.
Remark: a duplicate search in a table created with the "Duplicate search uses tautomers" option is performed as a tautomer duplicate search, unless the "Tautomer search" option is explicitly switched off.

The following restrictions apply in tautomer search mode:

  • The query must not have any query features.
  • The query must have all implicit hydrogens set for proper perception of the possible tautomers.
  • Double bond stereo settings may be invalid in some tautomeric forms when they refer to a tautomeric region. This kind of stereo information is ignored during tautomer search, but stereo information that is independent of tautomerizable groups is checked by default.
  • All tetrahedral stereo centers are protected during tautomer search.
  • This search option is best suited to full or full fragment search, because the tautomers of the query are generated for a whole molecule and not for a substructure.

Table 1. Tautomer search examples

[Structure images not reproduced; columns: Query, Target with tautomer searching off, Target with tautomer searching on.]

Tautomer duplicate filtering

When this search option is set, the search is performed using the generic tautomer form of the query and the target. The option is effective only for duplicate search. Furthermore, for jchem table or index search operations, the "tdf:y" option has to be set for the jchem index or the underlying JChem table. The underlying method is described in detail in the JChem Database Concepts section.

Restrictions: like in the case of tautomer search.

Explicit Hydrogens

From JChem version 5.1.3 explicit plain or isotope hydrogens in tautomerizable groups are treated as relocatable. The explicit Hydrogen constraint is enforced at the same time in the migrated location.

Vague bond search

These search options allow you to choose between several levels of strictness in matching bond types, especially regarding aromaticity. The higher the level, the more tolerant the bond matching becomes. Vague bond options are only used when exactBondMatching is off. Otherwise (e.g. for the DUPLICATE search type), vague bond level 0 (off) is used.

To fully exploit the vague bond functionality, it is best to use search objects that perform aromatization internally: JChemSearch and StandardizedMolSearch.

Table 2. summarizes the vague bond levels, and the following sections describe them in detail.

Table 2.

Vague bond level     Description
Level 0 (off)        Does not perform vague bond matching.
Level 1 (default)    Handling of 5-membered rings with ambiguous aromaticity, 1-atom-long aromatic ring ligands and bridging bonds between two aromatic rings.
Level 2              All query ring bonds, 1-atom-long aromatic ring ligands and bridging bonds between two aromatic rings become "or aromatic".
Level 3              All query bonds (ring and chain) become "or aromatic".
Level 4              Ignore all bond types.

Methods used in vague bond search

5-membered rings with ambiguous aromaticity

This method handles some commonly occurring 5-membered query ring patterns drawn in Kekule form that have ambiguous aromaticity. It can therefore return the hits "visually expected by chemists", even though strict bond matching would not return them. A few such ambiguous ring substructures are depicted below, with their corresponding aromatic and nonaromatic superstructures.

Table 3.

[Structure images not reproduced; columns: Ambiguous substructure, Aromatic example, Nonaromatic example.]

This method (used by default from JChem 3.2) ensures the expected matching of all queries in which these substructures appear. When these rings are not handled, by contrast, the query matches only the aromatic or only the aliphatic targets, depending on the ambiguous query ring. (See examples.)

For efficiency reasons, when the query contains more than 5 such 5-membered ring patterns, the ambiguous ring patterns are treated the same way as all ring bonds at level 2, described below.

Table 4. shows the difference between handling and not handling ambiguous rings.

Table 4.

[Structure images not reproduced; columns: Query, Target with ambiguous aromatic rings not handled, Target with ambiguous aromatic rings handled.]

1-atom-long aromatic ring ligands

Used by default from JChem 5.3. Single or 'single or double' bonds that are connected to an aromatic ring are allowed to match an aromatic bond, except if

  • there is another ligand on the same ring atom, or
  • the bond continues in a longer (more than 1 bond long) chain.

Remark: when the bond is connected to an ambiguous 5-membered ring, it can match an aromatic bond only if the ring is evaluated as aromatic.
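
The rule above can be read as a simple predicate. The following is a minimal sketch, assuming a hypothetical RingLigandRule class and a simplified QueryBond enum introduced only for illustration; it is not part of the JChem API and does not perform any ring perception itself.

```java
/** Hypothetical model of the "1-atom-long aromatic ring ligand" rule; not JChem API. */
final class RingLigandRule {

    /** Simplified query bond types, used only in this sketch. */
    enum QueryBond { SINGLE, DOUBLE, SINGLE_OR_DOUBLE }

    /**
     * Decides whether a query bond leaving a ring atom may match an aromatic target bond.
     *
     * @param bond               type of the query bond attached to the ring atom
     * @param ringIsAromatic     the ring is aromatic (an ambiguous 5-membered ring
     *                           counts only if it is evaluated as aromatic)
     * @param otherLigandOnAtom  the same ring atom carries another ligand
     * @param chainLongerThanOne the bond continues in a chain longer than one bond
     */
    static boolean mayMatchAromatic(QueryBond bond,
                                    boolean ringIsAromatic,
                                    boolean otherLigandOnAtom,
                                    boolean chainLongerThanOne) {
        // Only single and 'single or double' bonds are eligible.
        boolean eligibleType = bond == QueryBond.SINGLE || bond == QueryBond.SINGLE_OR_DOUBLE;
        // Both exceptions from the list above switch the relaxation off.
        return eligibleType && ringIsAromatic && !otherLigandOnAtom && !chainLongerThanOne;
    }
}
```

In a real search the three boolean inputs would come from the search engine's own analysis of the query; here they are plain parameters so that the rule itself stays visible.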

Table 5.

[Structure images not reproduced; columns: Query, Target.]

Bridging bonds between two aromatic rings

Used by default from JChem 5.3. Single bonds connecting two aromatic rings are allowed to match an aromatic bond. See also the remark about ambiguous rings in the previous method.

Table 6.

[Structure images not reproduced; columns: Query, Target.]

Generalizing bond matching

Generalizes bond matching so that a query bond can also match an aromatic bond, or so that query bond types are ignored entirely.

Vague bond search levels

Level 0 (vague bond matching off)

This corresponds to the behaviour before JChem 3.2.

This level must be used if you would like to distinguish between different resonance structures and you are passing molecules in Kekule (unaromatized) form into the search object (MolSearch class only).

Level 1 (default)

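Applied methods:

  • 5-membered rings with ambiguous aromaticity
  • 1-atom-long aromatic ring ligands
  • Bridging bonds between two aromatic rings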

Higher levels (vague bond levels 2-4)

The higher-level vague bond options are convenience options: their effect can also be achieved by using appropriate query bond types in the query. Use these options with care in database searching, because they make fingerprint screening inefficient.

They have the following effect on the query bond types:

    Level 2
    Generalizes all ring bond types to also match aromatic.
        Also applies '1-atom-long aromatic ring ligands' and
        'Bridging bonds between two aromatic rings' methods
        (since all ring bonds can match aromatic, all rings are considered aromatic)
    Level 3
    Generalizes all bond types to also match aromatic.
    Level 4
    Ignores all bond types.

Table 7. describes what bond type transformations are performed on the query before the search:

Table 7.

Original bond type    Vague bond level
in query              2 (Ring bonds + special ligands    3 (All bonds)    4 (All bonds)
                      and bridging bonds)
(S)                   (S/A)                              (S/A)            (A)
(D)                   (D/A)                              (D/A)            (A)
(T)                   (A)                                (A)              (A)
(Ar)                  (Ar)                               (Ar)             (A)
(S/D)                 (A)                                (A)              (A)
(S/A)                 (S/A)                              (S/A)            (A)
(D/A)                 (D/A)                              (D/A)            (A)
(A)                   (A)                                (A)              (A)

Abbreviations in Table 7.: S - single; D - double; T - triple; Ar - aromatic; S/D - single or double; S/A - single or aromatic; D/A - double or aromatic; A - any.
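
The transformations listed in Table 7 can be expressed as a small lookup. The sketch below is an illustration only: the class VagueBondTable, its Bond enum and the affectedAtLevel2 flag are hypothetical names, not JChem API, and the level 1 methods (which depend on ring perception) are deliberately not modelled.

```java
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical lookup reproducing the rows of Table 7; not JChem API. */
final class VagueBondTable {

    /** Bond types, named after the abbreviations used for Table 7. */
    enum Bond { S, D, T, AR, S_OR_D, S_OR_A, D_OR_A, ANY }

    /** The column shared by levels 2 and 3 in Table 7. */
    private static final Map<Bond, Bond> OR_AROMATIC = new EnumMap<>(Bond.class);
    static {
        OR_AROMATIC.put(Bond.S, Bond.S_OR_A);
        OR_AROMATIC.put(Bond.D, Bond.D_OR_A);
        OR_AROMATIC.put(Bond.T, Bond.ANY);
        OR_AROMATIC.put(Bond.AR, Bond.AR);
        OR_AROMATIC.put(Bond.S_OR_D, Bond.ANY);
        OR_AROMATIC.put(Bond.S_OR_A, Bond.S_OR_A);
        OR_AROMATIC.put(Bond.D_OR_A, Bond.D_OR_A);
        OR_AROMATIC.put(Bond.ANY, Bond.ANY);
    }

    /**
     * Returns the query bond type used for matching at the given vague bond level.
     *
     * @param original         bond type drawn in the query
     * @param level            vague bond level, 0 to 4
     * @param affectedAtLevel2 true for ring bonds, special ligands and bridging bonds,
     *                         i.e. the bonds that level 2 generalizes
     */
    static Bond transform(Bond original, int level, boolean affectedAtLevel2) {
        if (level < 0 || level > 4) {
            throw new IllegalArgumentException("vague bond level must be 0..4: " + level);
        }
        switch (level) {
            case 2:
                return affectedAtLevel2 ? OR_AROMATIC.get(original) : original;
            case 3:
                return OR_AROMATIC.get(original);
            case 4:
                return Bond.ANY;
            default:
                // Levels 0 and 1 do not rewrite bond types in this sketch.
                return original;
        }
    }
}
```

For example, a single chain bond that is not a special ligand or bridging bond stays single at level 2, becomes "single or aromatic" at level 3, and becomes "any" at level 4.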

Table 8.

[Structure images not reproduced; columns: Query and Target, each shown for vague bond levels 0 (off), 2, 3 and 4.]

Checking sp-hybridization state

The sp-hybridization option specifies if the sp-hybridization state of the atoms should be considered.

Calculation of the sp-hybridization state

The following states are considered:
  • sp - linear configuration (e.g. C in CO2)
  • sp2 - planar configuration (e.g. C atoms in benzene)
  • sp3 - tetrahedral configuration (e.g. C in methane)
The sp-hybridization state of hetero atoms is also determined by counting their lone electron pairs.
This calculated sp-hybridization state reflects the spatial configuration of the C, N and O atoms rather than the sp-hybridization of the orbitals. It does not cover all the mixed orbitals of atoms such as Si, S and P.
The rules for determining the sp-hybridization state of an atom are given in Table 9.

Table 9. Calculation rules

Hybridization state   Conditions (OR relation)
unknown
  • query bonds
  • >2 double bonds
  • >1 triple bond
  • both double and triple bonds
s
  • hydrogen
  • helium
sp
  • two double bonds
  • one triple bond
sp2
  • one double bond
  • aromatic bonds exist
sp3
  • heavy atom having only single bonds

When sp-hybridization checking is switched on, some searches return fewer hits than without it, because atoms that would otherwise match have different sp-hybridization states.
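
The rules of Table 9 can be sketched as a short classification function. This is an illustration under stated assumptions, not the JChem implementation: the class SpHybSketch and its parameters are hypothetical, the order in which the rules are tested is an assumption (the table only states an OR relation within each row), and the lone-pair counting for hetero atoms mentioned above is not modelled.

```java
/** Hypothetical classifier following the rows of Table 9; not JChem API. */
final class SpHybSketch {

    enum State { UNKNOWN, S, SP, SP2, SP3 }

    /**
     * Classifies one atom from its element and the bonds attached to it.
     *
     * @param atomicNumber  1 for hydrogen, 2 for helium, larger for heavy atoms
     * @param doubleBonds   number of double bonds on the atom
     * @param tripleBonds   number of triple bonds on the atom
     * @param aromaticBonds number of aromatic bonds on the atom
     * @param hasQueryBonds true if any attached bond is a query bond type
     */
    static State classify(int atomicNumber, int doubleBonds, int tripleBonds,
                          int aromaticBonds, boolean hasQueryBonds) {
        // Row "s": hydrogen and helium.
        if (atomicNumber == 1 || atomicNumber == 2) {
            return State.S;
        }
        // Row "unknown": any one of the listed conditions is enough.
        if (hasQueryBonds || doubleBonds > 2 || tripleBonds > 1
                || (doubleBonds > 0 && tripleBonds > 0)) {
            return State.UNKNOWN;
        }
        // Row "sp": two double bonds or one triple bond.
        if (doubleBonds == 2 || tripleBonds == 1) {
            return State.SP;
        }
        // Row "sp2": one double bond or any aromatic bond.
        if (doubleBonds == 1 || aromaticBonds > 0) {
            return State.SP2;
        }
        // Row "sp3": heavy atom with only single bonds (assumed here to include no bonds at all).
        return State.SP3;
    }
}
```

With these rules the carbon of CO2 (two double bonds) is classified as sp, a benzene carbon (aromatic bonds) as sp2, and the carbon of methane as sp3, matching the examples given for the three states.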

Examples for searching with sp-hybridization checking

Table 10.

[Structure images not reproduced; columns: Query, Target, with results for sp-hybridization checking ON and OFF.]

Sp-hybridization checking may be used together with vague bond level 4. In this case all bonds in the query match all kinds of target bonds. With these two options combined, molecules whose atoms have the same sp-hybridization state are retrieved regardless of their bond types.

Table 11. Results with vague bond level 4, ignoring all bond types.

[Structure images not reproduced; columns: Query, Target, with results for sp-hybridization checking ON and OFF.]

Examples for searching with different implicit H matching modes

For these examples the search type is set to duplicate.

Table 12. Results with different implicit H matching modes, duplicate search.

[Structure images not reproduced; columns: Query, Target, and results for implicit H matching Enabled, Disabled, Ignore, and Ignore with isotope matching switched off.]

Table 13. In duplicate search, charge matching mode "ignore" forces implicit H matching mode "ignore".

[Structure images not reproduced; columns: Query, Target, and the result for charge matching Ignore.]

Summary of search options

In this section, all search options are summarized and their usage is shown in different search interfaces.

Detailed descriptions of all search options are available in the JChem Search Option Guide.

Usage of options

  1. MolSearch API: Options are set through a MolSearchOptions object:
            MolSearch ms;
            // ...
            MolSearchOptions mso = ms.getSearchOptions();
            mso.setImplicitHMatching(SearchConstants.IMPLICIT_H_MATCHING_ENABLED);  // Setting search options
            // ...
    
  2. JChemSearch API: Options are set through a JChemSearchOptions object:
            JChemSearch searcher = new JChemSearch();
            // ...
            JChemSearchOptions searchOptions = new JChemSearchOptions();
            searchOptions.setReturnsNonHits(true); // Setting search options
            // ...
            searcher.setSearchOptions(searchOptions);
    
    Remark: getSearchOptions() and setSearchOptions(...) methods are available in both MolSearch and JChemSearch objects.

  3. Cartridge jc_compare operator: given as third parameter ('options')
  4. jcsearch command-line tool: command-line parameters starting with - or --.

Feature name Option specification in
MolSearch JChemSearch JChem Cartridge jcsearch
Absolute stereo (ignore chiral flag) setQueryAbsoluteStereo(true/false), setTargetAbsoluteStereo(true/false) setAbsoluteStereo( ABS_STEREO_TABLE_OPTION / ABS_STEREO_CHIRAL_FLAG / ABS_STEREO_ALWAYS_ON ) absoluteStereo:t/c/a --queryAbsoluteStereo:y/n, --targetAbsoluteStereo:y/n, --DBAbsoluteStereo:T/C/A
Attached data matching behaviour setAttachedDataMatch (ATTACHED_DATA_MATCH_IGNORE / ATTACHED_DATA_MATCH_GENERAL / ATTACHED_DATA_MATCH_EXACT) attachedDataMatch:i/g/e --attachedDataMatch:i/g/e
Attached data prefixes to check setAttachedDataPrefixes (comma separated values of prefixes) attachedDataPrefixes:comma separated values of prefixes --attachedDataPrefixes: comma separated values of prefixes
Charge matching mode setChargeMatching( CHARGE_MATCHING_DEFAULT / CHARGE_MATCHING_EXACT / CHARGE_MATCHING_IGNORE) exactChargeMatchingOption:d/e/i --charge:d/e/i
Chemical Terms filter expression setFilter(chemical-terms-expression) setChemTermsFilter(chemical-terms-expression) ctFilter:chemical-terms-expression -e "expression" / file name or --expression "expression" / file name
Copolymer matching setCopolymerMatching( true/false ) copolymerMatching:y/n --copolymerMatching:y/n
Dissimilarity metrics (only in DB search) setDissimilarityMetrics( "tanimoto" / "tversky" / "dice" / "euclidean" / "normalized_euclidean" / "substructure" / "superstructure" ) dissimilarityMetric -
Distinct first atom matching: enable/disable setDistinctFirstAtomMatching(false/true) - - --distinctFirstAtomMatching:n/y
Double bond stereo: no check / marked / all double bonds setDoubleBondStereoMatchingMode( DBS_NONE / DBS_MARKED / DBS_ALL) doubleBondStereo:N/M/A --doubleBondStereo:N/M/A
Enumeration of all query-target mappings findAll() or findFirst() and consecutive findNext() calls - - --allHits
Exact bond matching setExactBondMatching(true/false) - -
Exact special atom matching setExactSpecialAtomMatching(String) exactSpecialAtomMatching: comma separated identifiers of special atoms --exactSpecialAtomMatching: comma separated identifiers of special atoms
Exact stereo matching setExactStereoMatching(true/false) exactStereoMatching:y/n --exactStereoSearch:y/n
Exact query atom matching setExactQueryAtomMatching(true/false) exactQueryAtomMatching:y/n --exactQueryAtomMatching:y/n
Homology narrow translation setHomologyNarrowTranslation( chemaxon.sss.search.options.HomologyTranslationOption ) homologyNarrowTranslation:n/a/m --homologyNarrowTranslation:n/a/m
Homology broad translation setHomologyBroadTranslation( chemaxon.sss.search.options.HomologyTranslationOption ) homologyBroadTranslation:n/a/m --homologyBroadTranslation:n/a/m
Hydrogen count query property interpretation setHCountMatching( HCOUNT_MATCHING_AUTO / HCOUNT_MATCHING_EQUAL / HCOUNT_MATCHING_GREATER_OR_EQUAL ) HCountMatching:G/E/A --HCountMatching:G/E/A
Ignore allene stereo setIgnoreAlleneStereo( true/false) - --ignoreAlleneStereo:y/n
Ignore axial stereo setIgnoreAxialStereo( true/false) - --ignoreAxialStereo:y/n
Ignore Chemical Terms evaluation errors setIgnoreCTExceptions( true/false ) ignoreCTExceptions:y/n --ignoreCTExceptions:y/n
Ignore double bond stereo setIgnoreDoubleBondStereo( true/false) - --ignoreDoubleBondStereo:y/n
Ignore mixture brackets setMixSgroupMatching( MIX_SGROUP_MATCHING_ON, MIX_SGROUP_MATCHING_OFF ) ignoreMixtureBrackets:y/n --mix:d/i
Ignore polymer brackets setPolymerMatching( true/false ) polymer:d/i --polymer:d/i
Ignore syn-anti stereo setIgnoreSynAntiStereo( true/false) - --ignoreSynAntiStereo:y/n
Ignore tetrahedral stereo setIgnoreTetrahedralStereo( true/false) - --ignoreTetrahedralStereo:y/n
Implicit H matching mode setImplicitHMatching( IMPLICIT_H_MATCHING_DEFAULT / IMPLICIT_H_MATCHING_ENABLED / IMPLICIT_H_MATCHING_DISABLED / IMPLICIT_H_MATCHING_IGNORE ) implicitHMatching:d/y/n/i --implicitHMatching:d/y/n/i
Inverse hit list - setReturnsNonHits(true/false) jc_compare(...) = 0 -n
Isotope matching mode setIsotopeMatching( ISOTOPE_MATCHING_DEFAULT / ISOTOPE_MATCHING_EXACT / ISOTOPE_MATCHING_IGNORE ) isotope:d/e/i --isotope:d/e/i
Markush search enable/disable setMarkushEnabled(false/true) (cannot be set, depends on database table type) - --markush:n/y
Maximum internal search steps setTimeoutLimit(int) - - -
Maximum number of hits - setMaxResultCount(int) maxHitCount:int --maxResults:int
Maximum search time - setMaxTime(long) maxTime:int -
Multiple queries (in "and" relation) - - - --and
Multiple queries (in "or" relation) - - concatenated queries --or
Optimize query atom order (for performance) setKeepQueryOrder(false/true) - --keepQueryOrder
Optimize queries containing special query features (atom lists, bond lists, ...) - setOptimizeQueries(false/true) - --optimizeQueries:y/n
Order sensitive hits setOrderSensitiveSearch(true/false) - - --orderSensitive
Ordering of results - setOrder( NO_ORDERING / ORDERING_BY_ID / ORDERING_BY_ID_OR_SIMILARITY ) - -
Polymer end group matching setEndgroupMatching( true/false ) endGroupMatching:y/n --endGroupMatching:y/n
Polymer phase-shift setPhaseShiftedMatching( true/false ) phaseShift:y/n --phaseShift:y/n
Pre-assignment of query and target atoms addMatch(int, int) addMatch(int[], int[], int) - - -
Radical matching mode setRadicalMatching( RADICAL_MATCHING_DEFAULT / RADICAL_MATCHING_EXACT / RADICAL_MATCHING_IGNORE) radical:d/e/i --radical:d/e/i
Reaction search handling of unpaired maps setReactionUnpairedMapMatching( REACTION_UNPAIRED_MAP_MATCHES_ALL / REACTION_UNPAIRED_MATCHES_UNPAIRED_ONLY ) - --reactionUnpairedMap:all/unpairedOnly
Result table - setResultTable(String) - -
Search type (substructure, full, full fragment, duplicate, superstructure, etc.) setSearchType(SUBSTRUCTURE / SUPERSTRUCTURE / FULL / FULL_FRAGMENT / DUPLICATE ) t:s/f/ff/d/t/r -t:s/f/ff/d/i/u/c
(dis)Similarity threshold for similarity search - setDissimilarityThreshold(float) simThreshold:float -t:i:float
Sp-hybridization state checking setCheckSpHyb(true/false) checkSpHyb:Y/N --checkSpHyb:y/n
Standardization configuration setStandardizer(standardizer, boolean, boolean) (it can be set as a table property) -S / --standardize "file name" / "action string"
Stereo on/off setStereoSearch(true/false) stereoSearch:Y/N --stereoSearch:y/n
Stereo search type setStereoSearchType (STEREO_SPECIFIC / STEREO_IGNORE / STEREO_EXACT / STEREO_DIASTEREOMER / STEREO_ENANTIOMER) stereoSearchType:s/i/e/d/a --stereoSearchType:s/i/e/d/a
Stereo model setStereoModel( STEREO_MODEL_LOCAL / STEREO_MODEL_COMPREHENSIVE / STEREO_MODEL_GLOBAL ) - --stereoModel:l/g/c
SQL SELECT statement for pre-filtering - setFilterQuery(String) filterQuery:select-statement -
Tautomer search setTautomerSearch( TAUTOMER_SEARCH_DEFAULT / TAUTOMER_SEARCH_ON / TAUTOMER_SEARCH_OFF ) tautomerSearch:d/y/n --tautomerSearch:d/y/n
Tautomer duplicate filtering setTautomerDuplicateFiltering( true/false ) tdf:y/n --tdf
Transform monomer representations setMonomerTransform( true/false ) transformMonomer:y/n --transformMonomer:y/n
Undefined R-atom matching mode setUndefinedRAtom( UNDEF_R_MATCHING_GROUP / UNDEF_R_MATCHING_GROUP_H / UNDEF_R_MATCHING_GROUP_H_EMPTY / UNDEF_R_MATCHING_ALL / UNDEF_R_MATCHING_UNDEF_R ) undefinedRAtom:g/gh/ghe/a/u --undefinedRAtom:g/gh/ghe/a/u
Hit ordering type setHitOrdering( HIT_ORDERING_NONE / HIT_ORDERING_UNDEF_R_MATCHING_GROUP_FIRST ) - --hitOrdering:n/g
Vague bond search levels setVagueBondLevel( VAGUE_BOND_OFF / VAGUE_BOND_LEVEL1 / VAGUE_BOND_LEVEL2 / VAGUE_BOND_LEVEL3 / VAGUE_BOND_LEVEL4 ) vagueBond:n/1/2/3/4 --vagueBond:n/1/2/3/4
Valence matching mode setValenceMatching( VALENCE_MATCHING_ON / VALENCE_MATCHING_IGNORE ) - --valence:d/i
(The constants used are defined in the class chemaxon.sss.SearchConstants.)
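
The JChem Cartridge entries in the table above all have the form name:value. As a minimal sketch of how such entries might be assembled into one options string for the third parameter of jc_compare, the helper below joins them with a caller-supplied separator. The class OptionStringSketch is hypothetical and not part of JChem, and the choice of separator (a single space in the usage note) is an assumption that should be checked against the Cartridge documentation.

```java
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper that joins name:value option entries; not JChem API. */
final class OptionStringSketch {

    /**
     * Builds an options string from name/value pairs, in the iteration order of the map.
     *
     * @param options   option names mapped to their values, e.g. "t" to "s"
     * @param separator text placed between two entries (assumed, not taken from the source)
     */
    static String cartridgeOptions(Map<String, String> options, String separator) {
        StringJoiner joiner = new StringJoiner(separator);
        for (Map.Entry<String, String> entry : options.entrySet()) {
            // Each entry follows the name:value pattern used in the table.
            joiner.add(entry.getKey() + ":" + entry.getValue());
        }
        return joiner.toString();
    }
}
```

Called with a LinkedHashMap holding t=s, tautomerSearch=y and vagueBond=2 and a single space as separator, the helper returns the string "t:s tautomerSearch:y vagueBond:2". Using an insertion-ordered map keeps the options in the order in which they were added.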
