To assist searching structures in a database, JChem provides the chemaxon.jchem.db.JChemSearch JavaBean. The following search types are supported:
Using the connection (ConnectionHandler) passed by the caller method, JChemSearch retrieves all structures that match the search criteria from the given structure table and returns their cd_id values in an int array.
Oracle users may also use JChem Cartridge for Oracle to perform search and other operations via SQL commands.
Comments:
It is recommended to use MarvinSketch for drawing query structures.
Steps of creating a web page for entering query structures:
MSketch.getMol(...).
For example, if the name of the hidden input variable is "molfile"
and you need the structure in Marvin format use this call
in JavaScript:
form.molfile.value=document.MSketch.getMol('mrv');
(You can also get the structure in other formats, like MDL's Molfile or SMILES,
but the Marvin format is recommended as it can represent all molecule and
query features that are available in Marvin Sketch. You can find more
information about file formats
here.)
JChemSearch
class
After creating a JChemSearch object, setting the following properties is necessary:
| queryStructure | the query structure in SMILES, Molfile, or other format |
| connectionHandler | specifies the connection |
| structureTable | the table in the database where the structures are stored |
| searchType | specifies the type of the search |
API of the JChemSearch class.
Many of these options are detailed in the
Substructure Search
section.
Example:
JChemSearchOptions jcSearchOptions = new JChemSearchOptions();
jcSearchOptions.setSearchType(SearchConstants.SUBSTRUCTURE);
JChemSearch jChemSearch = new JChemSearch();
jChemSearch.setStructureTable(structTableName);
jChemSearch.setQueryStructure("Brc1ccccc1");
jChemSearch.setSearchOptions(jcSearchOptions);
jChemSearch.setConnectionHandler(connectionHandler);
jChemSearch.run();
Please see the InitializingSearch.java
example in the examples/java/search directory.
This search type can be used to retrieve the same molecule as the query. It is used to check whether a chemical structure already exists in the database, and also during duplicate filter import. All structural features (atom types, isotopes, stereochemistry, query features, etc.) must be the same for matching, but for example coordinates and dimensionality are usually ignored.
For this search mode there is no searches-per-minute license limitation in JChemBase; these searches are not counted.
Java example: Throwing an exception if a given structure exists.
... // Initialize connection
String mol = "Clc1cccc(Br)c1"; // Query in SMILES/SMARTS,
// MDL Molfile or other format
String structureTableName = "cduser.structures";
JChemSearch searcher = new JChemSearch(); // Create searcher object
searcher.setQueryStructure(mol);
searcher.setConnectionHandler(conHandler);
searcher.setStructureTable(structureTableName);
searcher.setRunMode(JChemSearch.RUN_MODE_SYNCH_COMPLETE);
JChemSearchOptions searchOptions = new JChemSearchOptions();
searchOptions.setSearchType(JChemSearch.DUPLICATE);
searcher.setSearchOptions(searchOptions);
searcher.run();
if(searcher.getResultCount()>0)
throw new Exception("Structure already exists (cd_id=" +
searcher.getResult(0) + ")");
...
More examples:
<JChem's home>/examples/db_search/update.jsp
<JChem's home>\examples\asp\update.asp
Substructure search finds all structures that contain the query structure as a subgraph. Sometimes the query provides not only the chemical subgraph but also query features that further restrict the match. If special molecular features are present on the query (e.g. stereochemistry, charge, etc.), only those targets match which also contain the feature. However, if a feature is missing from the query, it is not required to be missing from the target (by default). For more information, see the JChem Query Guide.
Searching starts with a fast screening phase where query and database fingerprints are compared. If the result of the screening is positive (meaning that a fit is possible) for a database structure, then an atom-by-atom search (ABAS) is also performed. Query structures may contain query atoms and bonds described earlier.
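The screening phase can be pictured as a bitwise subset test, which is a common way of screening with hashed fingerprints. The following is a minimal sketch, not JChem code: it assumes that fingerprints are plain java.util.BitSet objects, and the FingerprintScreen class and mayContain method are hypothetical names used only for illustration.

```java
import java.util.BitSet;

/** Illustrative only: not part of the JChem API. */
final class FingerprintScreen {

    /**
     * Returns true if every bit set in the query fingerprint is also set
     * in the target fingerprint, meaning a substructure fit is still possible.
     */
    static boolean mayContain(BitSet target, BitSet query) {
        BitSet missing = (BitSet) query.clone();
        missing.andNot(target); // bits required by the query but absent in the target
        return missing.isEmpty();
    }

    public static void main(String[] args) {
        BitSet query = new BitSet(16);
        query.set(1);
        query.set(5);

        BitSet candidate = new BitSet(16);
        candidate.set(1);
        candidate.set(5);
        candidate.set(9);

        BitSet rejected = new BitSet(16);
        rejected.set(1);
        rejected.set(9);

        System.out.println(mayContain(candidate, query)); // prints "true"
        System.out.println(mayContain(rejected, query));  // prints "false"
    }
}
```

A positive screen only means that a fit is possible; in the process described above, the atom-by-atom search still has to confirm the match.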
The initialization of substructure searching is similar to
duplicate searching, but
the
searchType property of
JChemSearch
should be set to JChemSearch.SUBSTRUCTURE.
Java example:
... // Initialize connection
String mol = "[*]c1cccc([Cl,Br])c1"; // Query structure
String structureTableName = "cduser.structures";
JChemSearch searcher = new JChemSearch(); // Create searcher object
searcher.setQueryStructure(mol);
searcher.setConnectionHandler(conHandler);
searcher.setStructureTable(structureTableName);
JChemSearchOptions searchOptions = new JChemSearchOptions();
searchOptions.setSearchType(JChemSearch.SUBSTRUCTURE);
searcher.setSearchOptions(searchOptions);
searcher.run();
...
More examples:
<JChem's home>/examples/db_search/searching.jsp
<JChem's home>\examples\asp\searching.asp
Since substructure searching can be time consuming, it is reasonable
to create a new thread for the search.
If the
runMode
property of JChemSearch is set to
JChemSearch.RUN_MODE_ASYNCH_COMPLETE, then
searching runs in a separate thread.
The progress of the search
can be checked by the following properties of
JChemSearch:
| running | checks if searching is still running |
| progressMessage | textual information about the phase of the search process |
| resultCount | the number of hits found so far |
| currentId | the cd_id value of the molecule being checked |
Java application example:
searcher.setRunMode(JChemSearch.RUN_MODE_ASYNCH_COMPLETE);
searcher.run();
while(searcher.isRunning()) {
String msg = searcher.getProgressMessage();
int count = searcher.getResultCount();
int lastId = searcher.getCurrentId();
... // Displaying
Thread.sleep(2000);
}
Please see the SeparateSearchThread.java
example in the examples/java/search directory.
JSP example:
...
if(searcher.isRunning()) {
%>
<p>Please wait. Searching.....
<p><%= searcher.getProgressMessage() %>
<p>Hits: <%= searcher.getResultCount() %>
<p>
<%
if(searcher.getCurrentId()>0) {
%>
Current id: <%= searcher.getCurrentId() %>
<%
}
...
%>
<script LANGUAGE="JavaScript">
<!--
window.setTimeout('window.location.reload(false)', 5000);
//-->
</script>
<%
}
...
More examples:
<JChem's home>/examples/db_search/searching.jsp
<JChem's home>\examples\asp\searching.asp
If the
resultTableMode
property of
JChemSearch
is set to
JChemSearch.NO_RESULT_TABLE,
then the following properties can be used
for retrieving the results:
| resultCount | the number of hits found |
| maxTimeReached | returns true if the search stopped because the time elapsed since the start of the search reached the maximum value |
| maxResultCountReached | returns true if the search stopped because the number of hits reached the maximum value |
| result | returns the cd_id value of a found compound specified by an index value |
| exception, error, errorMessage | if an error occurred during the search, these properties provide information about the problem |
The two ways of retrieving the results of the search are:
The process of retrieving the results of the search from the JChemSearch object:
Call the getHitsAsMolecules(...) method on the same JChemSearch object used to run the search.
cd_id values obtained by a JChemSearch.getResults() or JChemSearch.getResult(int) call.
| coloring | Specifies if substructure hit coloring should be used. |
| enumerateMarkush | Specifies if Markush structures should be hit enumerated according to the query structure. |
| alignmentMode | Specifies what form of alignment to use for hit display. The following values are accepted: ALIGNMENT_OFF (default), ALIGNMENT_ROTATE, ALIGNMENT_PARTIAL_CLEAN |
Object array elements, one for each molecule.
Inside each Object array will be the fetched data field values. Molecule array return value.
Java example:
JChemSearch searcher = new JChemSearch(); // Create searcher object
...
int cd_ids[] = new int[cells]; //cd_id value
int cellIndex=0;
for(int i=0; i < cells; i++) {
cd_ids[i] = searcher.getResult(start + i);
}
HitColoringAndAlignmentOptions options= new HitColoringAndAlignmentOptions();
options.coloring = !noHitColoring
&& (substructureColoring || descriptorColoring);
options.enumerateMarkush = enumerateMarkush;
options.alignmentMode = HitColoringAndAlignmentOptions.ALIGNMENT_OFF;
if (!noAlignment) {
if (hitAlignment)
options.alignmentMode = HitColoringAndAlignmentOptions.ALIGNMENT_ROTATE;
else if (partialClean)
options.alignmentMode = HitColoringAndAlignmentOptions.ALIGNMENT_PARTIAL_CLEAN;
}
ArrayList dataFieldNames = new ArrayList();
dataFieldNames.add("cd_id");
dataFieldNames.add("cd_formula");
dataFieldNames.add("cd_molweight");
ArrayList dataFieldValues = new ArrayList();
Molecule[] mols = searcher.getHitsAsMolecules(cd_ids, options,
dataFieldNames, dataFieldValues);
...
More examples:
<JChem's home>/examples/db_search/searchresults.jsp
The process of retrieving the results of the search from the ResultSet Object:
SELECT cd_structure, ...
FROM cduser.structures
WHERE cd_id=12532
cd_id
value obtained by a
JChemSearch.getResult(...) call
in the condition of the SQL statement
cd_structure
obtained from the ResultSet.
marvin.js
to make sure that the pages can be viewed in different browsers,
GUI-s (AWT/Swing), and JVM-s (built-in/Java plugin).
cd_structure
column using HTMLTools.convertForJavaScript(...)
for inserting into a web page in an applet parameter.
Java example:
int[] cdIds = jChemSearch.getResults();
String retrieverSql =
"SELECT cd_molweight from " + structTableName
+ " where cd_id = ?";
PreparedStatement ps =
connectionHandler.getConnection().prepareStatement(
retrieverSql);
try {
for (int i = 0; i < cdIds.length; i++) {
int cdId = cdIds[i];
ps.setInt(1, cdId);
ResultSet rs = ps.executeQuery();
if (rs.next()) {
System.out.println("Mass: " + rs.getDouble(1));
} else {
; // has been deleted in the meantime?
}
}
} finally {
ps.close();
}
Please see the RetrievingResults.java
example in the examples/java/search directory.
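The Java example above issues one query per hit. When a whole page of hits is displayed at once, a single parameterized IN query per page may reduce the number of round trips. The sketch below only builds the SQL text; InClauseSketch and buildSql are hypothetical names, and the column names follow the example above.

```java
/** Illustrative only: builds SQL text, does not touch a database. */
final class InClauseSketch {

    /**
     * Builds a SELECT with one "?" placeholder per cd_id to fetch.
     * The table name is concatenated as in the example above, so it must
     * come from trusted configuration, never from user input.
     */
    static String buildSql(String structureTable, int idCount) {
        if (idCount < 1) {
            throw new IllegalArgumentException("at least one cd_id is required");
        }
        StringBuilder sql = new StringBuilder("SELECT cd_id, cd_molweight FROM ")
                .append(structureTable)
                .append(" WHERE cd_id IN (");
        for (int i = 0; i < idCount; i++) {
            if (i > 0) {
                sql.append(", ");
            }
            sql.append('?');
        }
        return sql.append(')').toString();
    }

    public static void main(String[] args) {
        // prints "SELECT cd_id, cd_molweight FROM structures WHERE cd_id IN (?, ?, ?)"
        System.out.println(buildSql("structures", 3));
    }
}
```

Each placeholder is then bound with ps.setInt(i + 1, cdIds[i]). Rows may come back in any order, which is why cd_id is selected as well. Some databases limit the length of an IN list (Oracle, for example, allows at most 1000 expressions), so long hit lists are best processed page by page.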
JSP example for displaying search results on a web page by a JSP script:
...
Hits: <%= searcher.getResultCount() %>
<font size="1">
<%= searcher.isMaxResultCountReached()? " (maximum hits reached)" : "" %>
<%= searcher.isMaxTimeReached()? " (maximum time reached)" : "" %>
</font>
<center>
<script LANGUAGE="JavaScript1.1" SRC="../../../marvin/marvin.js">
</script>
<script LANGUAGE="JavaScript1.1">
<!--
mview_name="mview";
mview_begin("../../../marvin",
"<%= cols*cellWidth+cols-1 %>",
"<%= rows*cellHeight+rows-1 %>");
mview_param("rows", "<%= rows %>");
mview_param("cols", "<%= cols %>");
mview_param("navmode", "rot3d");
mview_param("molbg", "#000000");
mview_param("bgcolor", "#e0e0e0");
mview_param("border", "1");
mview_param("animate", "0");
mview_param("layout0", ":4:1:"+
"L:0:0:1:1:w:n:0:10:"+ <%/* ID */%>
"L:1:0:1:1:w:n:0:10:"+ <%/* Stock */%>
"M:2:0:1:1:c:n:1:10"); <%/* Molecule */%>
mview_param("param0", ":"+
"L:11b:"+
"L:10:"+
"M:<%= structureWidth %>:<%= structureHeight %>");
<%
/*
* Writing data into the cells in the applet
*/
int cellIndex=0;
for(int i=start; i<start+cells; i++) {
int id = searcher.getResult(i);
/*
* SQL statement for retrieving structures
*/
String sql =
"SELECT " +
structureTableName + ".cd_id, "+
structureTableName + ".cd_structure, " +
stockTableName + ".quantity\n" +
"FROM "+ structureTableName + ", " +
stockTableName + "\n" +
"WHERE " +
structureTableName + ".cd_id = " +
stockTableName + ".cd_id AND " +
structureTableName + ".cd_id = " + id;
Statement stmt = con.createStatement();
ResultSet rs = stmt.executeQuery(sql);
try {
if(rs.next()) {
String dbMolfile = new String(DatabaseTools.readBytes(rs, 2),"ASCII");
float dbQuantity = rs.getFloat(3);
%>
mview_param("cell<%= cellIndex %>",
"|ID: <%= id %>"+
"|<%= Math.round(dbQuantity) %> mg"+
"|<%= HTMLTools.convertForJavaScript(
dbMolfile) %>");
<%
}
} finally {
rs.close();
stmt.close();
}
cellIndex++;
}
%>
mview_end();
//-->
</script>
</center>
...
More JSP examples:
<JChem's home>/examples/db_search/searchresults.jsp
<JChem's home>/examples/simple_db_search/searchresults.jsp
To store the results in a table, the
name of the table should be specified by the
resultTable
property of
JChemSearch, and also the
resultTableMode
property
should be set to either
JChemSearch.CREATE_OR_REPLACE_RESULT_TABLE or
JChemSearch.APPEND_TO_RESULT_TABLE.
If the runMode
property of JChemSearch is set to
JChemSearch.RUN_MODE_ASYNCH_PROGRESSIVE, then
searching runs in a separate thread and hits can be retrieved as soon as they
are found. Note that this mode does not support any ordering:
ArrayList hitsByPages = new ArrayList();
int[] nextPage = new int[NR_OF_HITS_PER_PAGE];
int idxForNextPage = 0;
searcher.setOrder(JChemSearch.NO_ORDERING);
searcher.setRunMode(JChemSearch.RUN_MODE_ASYNCH_PROGRESSIVE);
searcher.run(); // start the search; hits are consumed below as they are found
while(searcher.hasMoreHits()) {
nextPage[idxForNextPage++] = searcher.getNextHit();
if (idxForNextPage == NR_OF_HITS_PER_PAGE) {
synchronized (hitsByPages) {
hitsByPages.add(nextPage);
hitsByPages.notifyAll(); // notify any who may be in wait for
// the next page
nextPage = new int[NR_OF_HITS_PER_PAGE];
idxForNextPage = 0;
}
}
}
// hits for the last page if any
if (idxForNextPage > 0) {
int[] lastPage = new int[idxForNextPage];
System.arraycopy(nextPage, 0, lastPage, 0, idxForNextPage); // copy all remaining hits
synchronized (hitsByPages) {
hitsByPages.add(lastPage);
hitsByPages.notifyAll(); // notify any who may be in wait for
// the next page
}
}
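The paging arithmetic of the loop above can be tried without a database. This is a minimal sketch, assuming the hits are already available as an int array of cd_id values; HitPaging and toPages are hypothetical names, not JChem API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: splits a hit list into fixed-size pages. */
final class HitPaging {

    /** Returns the hits in pages of pageSize; the last page may be shorter. */
    static List<int[]> toPages(int[] hits, int pageSize) {
        if (pageSize < 1) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        List<int[]> pages = new ArrayList<>();
        for (int from = 0; from < hits.length; from += pageSize) {
            int to = Math.min(from + pageSize, hits.length);
            pages.add(Arrays.copyOfRange(hits, from, to)); // copies hits[from..to-1]
        }
        return pages;
    }

    public static void main(String[] args) {
        List<int[]> pages = toPages(new int[] {11, 12, 13, 14, 15}, 2);
        System.out.println(pages.size());                     // prints "3"
        System.out.println(Arrays.toString(pages.get(2)));    // prints "[15]"
    }
}
```

The last page holds exactly the hits that remain, so its length equals the number of leftover hits rather than one less.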
To boost the speed of substructure searching, JChem caches fingerprints and structures in the searcher application's memory. For more information, see the JChem database concepts section.
Many times structure information is only one of several conditions that a complex query has to check. In those cases structure searches should be combined with SQL queries.
Example: Suppose that quantities on stock are stored in a table different from the structure table. We are querying compounds that contain a given substructure and their quantity on stock is not less than a given value.
Two ways of performing the combined query:
JChemSearch object.
JChemSearch to save the
cd_id values of found compounds:
set the name of the result table using the
setResultTable method.
SELECT cd_structure, quantity FROM hits, structures, stock
WHERE hits.cd_id=structures.cd_id AND stock.cd_id=structures.cd_id
An arbitrary SQL query can be specified as a filter via the
filterQuery property. The (first) result column
should contain the allowed cd_id values.
Java example:
JChemSearchOptions jcSearchOptions = new JChemSearchOptions();
jcSearchOptions.setSearchType(SearchConstants.SUBSTRUCTURE);
jcSearchOptions.setFilterQuery("select cd_id from " + stockTableName
+ " where quantity < 2");
Please see the SearchCombinedWithSqlQuery.java
example in the examples/java/search directory.
JChemSearch, like
| maxResultCount | The maximum number of molecules that can be found by the search. |
| maxTime | The maximum amount of time, in milliseconds, available for searching. |
| stringToAppend | A string (like an ORDER BY sub-expression) to be appended to the SQL expression used for screening and retrieving rows from the structure table. |
| infoToStdError | If set to true, information useful for testing will be written in the servlet server's error log file. |
order
|
Specifies the order of the result.
Java example:
JChemSearchOptions jcSearchOptions = new JChemSearchOptions();
jcSearchOptions.setSearchType(SearchConstants.SIMILARITY);
jcSearchOptions.setDissimilarityThreshold((float) 0.6);
JChemSearch jChemSearch = new JChemSearch();
jChemSearch.setStructureTable(structTableName);
jChemSearch.setQueryStructure("c1ccccc1N");
jChemSearch.setSearchOptions(jcSearchOptions);
jChemSearch.setConnectionHandler(connectionHandler);
// Change the default which is by similarity and id:
jChemSearch.setOrder(JChemSearch.ORDERING_BY_ID);
Please see the HitsInSpecificOrder.java
example in the examples/java/search directory.
|
Superstructure search finds all molecules where the query is a superstructure
of the target. It can be invoked in a similar fashion to
Substructure search.
Set the search type to JChemSearch.SUPERSTRUCTURE.
A full structure search finds molecules that are equal (in size) to the query structure. (No additional fragments or heavy atoms are allowed.) Molecular features (by default) are evaluated the same way as described above for substructure search.
For this search type,
the
searchType property of
JChemSearch
should be set to JChemSearch.FULL.
Full fragment search is between substructure and full search: the query must fully match a whole fragment of the target. Other fragments may be present in the target; they are ignored. This search type is useful for performing a "Full structure search" that ignores salts or solvents beside the main structure in the target.
For this search type,
the
searchType property of
JChemSearch
should be set to JChemSearch.FULL_FRAGMENT.
Similarity searching finds molecules that are similar to the query structure. By default the search uses the Tanimoto coefficient. The Tanimoto coefficient has two arguments:
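$$ 1 - \frac{N_{A \& B}}{N_A + N_B - N_{A \& B}} $$
Here N_A and N_B are the numbers of bits set in the fingerprints of molecules A and B, and N_{A&B} is the number of bits set in both.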
Other dissimilarity metrics can be set by the
setDissimilarityMetric function. Possible values:
...
JChemSearchOptions jcso = new JChemSearchOptions();
jcso.setDissimilarityMetric("Tversky,0.3,0.6");
...
The dissimilarity threshold is a number between 0 and 1, which specifies a cutoff limit in the similarity calculation. If the dissimilarity value is less than the threshold, then the query structure and the given database structure are considered similar.
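To make the threshold comparison concrete, the sketch below computes the Tanimoto-based dissimilarity, 1 - N(A&B) / (N(A) + N(B) - N(A&B)), on two bit sets and compares it with a threshold. It is an illustration only: fingerprints are assumed to be java.util.BitSet objects, TanimotoSketch and its methods are hypothetical names, and treating two empty fingerprints as identical is an assumption of this sketch.

```java
import java.util.BitSet;

/** Illustrative only: not part of the JChem API. */
final class TanimotoSketch {

    /** Tanimoto dissimilarity: 1 - N(A&B) / (N(A) + N(B) - N(A&B)). */
    static double dissimilarity(BitSet a, BitSet b) {
        BitSet both = (BitSet) a.clone();
        both.and(b); // bits set in both fingerprints
        int nAB = both.cardinality();
        int union = a.cardinality() + b.cardinality() - nAB;
        if (union == 0) {
            return 0.0; // assumption: two empty fingerprints count as identical
        }
        return 1.0 - (double) nAB / union;
    }

    /** A structure counts as similar if its dissimilarity is below the threshold. */
    static boolean isSimilar(BitSet a, BitSet b, double threshold) {
        return dissimilarity(a, b) < threshold;
    }

    public static void main(String[] args) {
        BitSet a = new BitSet();
        a.set(0, 4); // bits 0..3
        BitSet b = new BitSet();
        b.set(2, 6); // bits 2..5

        // 2 common bits, 6 bits in the union: dissimilarity = 1 - 2/6
        System.out.println(isSimilar(a, b, 0.7)); // prints "true"
        System.out.println(isSimilar(a, b, 0.6)); // prints "false"
    }
}
```

Lowering the threshold from 0.7 to 0.6 excludes this pair, which shows why a lower threshold returns only hits that are closer to the query.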
See more details on fingerprints in the section Parameters for Generating Chemical Hashed Fingerprints
Similarity searching should be used the same way as
substructure searching. To enable similarity searching,
the
searchType property of
JChemSearch
should be set to JChemSearch.SIMILARITY.
If the
order
property is set to
DatabaseSearch.ORDERING_BY_ID_OR_SIMILARITY (which is the
default), then the hits returned by the
getResult() method will be sorted in increasing order of
dissimilarity.
The dissimilarity threshold is set on
JChemSearchOptions with this function:
setDissimilarityThreshold(float dissimilarityThreshold)
|
Sets the dissimilarity threshold. Expects a float value between 0 and 1. A lower threshold results in hits that are more similar to the query structure. |
The dissimilarity values predicted in the similarity calculation are retrieved with the
JChemSearch instance with this function:
getDissimilarity(int index)
|
Returns the predicted dissimilarity value for the hit corresponding to the given index. |
Java example:
...
JChemSearch searcher = new JChemSearch(); // Create searcher object
searcher.setQueryStructure(mol);
searcher.setConnectionHandler(conHandler);
searcher.setStructureTable(structureTableName);
JChemSearchOptions searchOptions = new JChemSearchOptions();
searchOptions.setSearchType(JChemSearch.SIMILARITY);
searchOptions.setDissimilarityThreshold(0.2f);
searcher.setSearchOptions(searchOptions);
searcher.run();
...
for(int i=0; i<searcher.getResultCount(); i++) {
float dissimilarity = searcher.getDissimilarity(i);
...
}
...
More examples:
<JChem's home>/examples/db_search/searching.jsp
<JChem's home>\examples\asp\searching.asp
If a result table is generated
during a similarity search,
then the table will contain both the cd_id and
the calculated similarity values.
Users can open up new ways of similarity searching by using a number of built-in molecular descriptor types other than the default chemical hashed fingerprints. There are a number of built-in molecular descriptors available, including CF, PF, Burden eigenvalue descriptor (or BCUT™) and various scalar descriptors.
The following example shows how simple it is to set up molecular descriptors for your compound library. The first command creates a table called compound_library and the second command adds the molecules from an SDF file. The third command uses the 'c' option to create the molecular descriptor, with the name of the structure table set by the -a flag and the chemical fingerprint descriptor type set by the -k flag. The commands omit the database login information, which was stored previously with the -s option. See the jcman command options and the GenerateMD command options for more information. Creating and assigning molecular descriptors to database structure tables is discussed with the GenerateMD command.
jcman c compound_library
jcman a compound_library my_compound_group.sdf
generatemd c -a compound_library -k CF chemical_fingerprint
Below is an example that runs the similarity search with the new chemical fingerprint.
The molecular descriptor name, chemical_fingerprint, is
set as a search option and the similarity search is run normally:
...
JChemSearch searcher = new JChemSearch(); // Create searcher object
searcher.setQueryStructure(mol);
searcher.setConnectionHandler(conHandler);
searcher.setStructureTable("compound_library");
JChemSearchOptions searchOptions = new JChemSearchOptions();
searchOptions.setSearchType(JChemSearch.SIMILARITY);
searchOptions.setDescriptorName("chemical_fingerprint");
searchOptions.setDissimilarityThreshold(0.2f);
searcher.setSearchOptions(searchOptions);
searcher.run();
...
...
//Start with database connection handler and name of the structure table.
MDTableHandler mdth = new MDTableHandler(connectionHandler, structureTableName);
String[] descriptor_ids = mdth.getMolecularDescriptors();
for (int x= 0; x < descriptor_ids.length; x++){
String mdName = descriptor_ids[x];
MolecularDescriptor descriptor = mdth.createMD(mdName);
//getting descriptor names:
String descriptorName = descriptor.getName();
//getting descriptor comments:
String descriptorComment = mdth.getMDComment(mdName);
//getting available metrics for each configuration:
String[] configNames = mdth.getMDConfigs(mdName);
for (int i=0; i < configNames.length; i++) {
MolecularDescriptor tempDesc=(MolecularDescriptor)descriptor.clone();
String config = mdth.getMDConfig(mdName,configNames[i]);
tempDesc.setScreeningConfiguration(config);
//getting metric name:
String metricName = tempDesc.getMetricName();
//getting default thresholds:
String defaultThreshold = tempDesc.getThreshold();
...
}
//Display code can go here
...
}
After selecting a molecular descriptor and other desired parameters, such as the descriptor configuration and the metric, the custom molecular descriptor name is set as a search option and the similarity search is run as normal. If the descriptor name, configuration or metric is omitted, a stored default value is used.
...
JChemSearch searcher = new JChemSearch(); // Create searcher object
searcher.setQueryStructure(mol);
searcher.setConnectionHandler(conHandler);
searcher.setStructureTable(structureTableName);
JChemSearchOptions searchOptions = new JChemSearchOptions();
searchOptions.setSearchType(JChemSearch.SIMILARITY);
searchOptions.setDescriptorName(selectedDescriptor);
searchOptions.setDescriptorConfig(selectedConfig);
searchOptions.setDissimilarityMetric(selectedMetric);
searchOptions.setDissimilarityThreshold(0.8f);
searcher.setSearchOptions(searchOptions);
searcher.run();
...
More examples:
If a query is started when the number of searches
has exceeded the quota,
JChemSearch throws
MaxSearchFrequencyExceededException.
It is recommended to catch this exception and display
a friendly message advising the user to try searching
later. If this exception occurs frequently, please contact
ChemAxon
and request a license key allowing more searches.
Click here to display a table that
helps you to determine the access level that suits your needs.
chemaxon.struc.Molecule objects)
can be performed by the use of
chemaxon.sss.search.MolSearch or
chemaxon.sss.search.StandardizedMolSearch
classes.
If the files to be searchable are only available in a molecular file format
in a string or stored in the file system, they have to be imported into
Molecule objects by the use of
chemaxon.formats.MolImporter or chemaxon.util.MolHandler classes.
The code example at the
MolSearch API description shows examples for the use
of both classes.
Various Java examples for importing molecules using JChem API are available in Java and HTML format.
An easy to use command line tool for searching and comparing molecules in files, databases or given as SMILES strings is jcsearch.
MolSearch and StandardizedMolSearch compare two Molecule objects
(a query and a target) to each other. Usually a MolSearch object
is used in the following scenario:
MolSearch ms = new MolSearch();   // search object creation
queryMol.aromatize();             // aromatization of query molecule
ms.setQuery(queryMol);            // assignment of query to search
targetMol.aromatize();            // aromatization of target molecule
ms.setTarget(targetMol);          // assignment of target molecule to search
// search type: SUBSTRUCTURE, DUPLICATE, etc.
ms.getSearchOptions().setSearchType(chemaxon.sss.SearchConstants.SUBSTRUCTURE);
// set other search options. For more info, see MolSearchOptions and its superclass, SearchOptions
// search operation
When using StandardizedMolSearch, the aromatization steps can be removed,
as this class takes care of this internally.
The search operation can be one of the following:
| Search operation | Description |
| ms.isMatching() | The most efficient way to decide whether there is a match between query and target. |
| ms.findAll() | Looks for all occurrences (matchings) of the query in the target. Returns an array containing the matches as arrays (int[][]), or null if there are no hits. Each match array (int[]) contains the atom indexes of the target atoms that match the query atoms (in the order of the corresponding query atoms). |
| ms.findFirst() and consecutive ms.findNext() calls | Same as findAll() above, but returns the individual match arrays one by one. findFirst() re-initializes the search object and starts returning matches from the start. Both return null if there are no more hits to return. |
| ms.getMatchCount() | Returns the number of matchings between query and target. |
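The match arrays described above are plain int arrays, so they can be post-processed with standard Java. The sketch below collects the distinct target atom indexes covered by any match, for example to highlight them. It does not call JChem: the int[][] input stands in for a findAll() result, MatchArraySketch and matchedTargetAtoms are hypothetical names, and skipping negative entries is a defensive assumption of this sketch.

```java
import java.util.SortedSet;
import java.util.TreeSet;

/** Illustrative only: post-processes match arrays shaped like a findAll() result. */
final class MatchArraySketch {

    /** Returns the distinct target atom indexes that occur in any match. */
    static SortedSet<Integer> matchedTargetAtoms(int[][] hits) {
        SortedSet<Integer> atoms = new TreeSet<>();
        if (hits == null) {
            return atoms; // null means "no hits", as described in the table above
        }
        for (int[] hit : hits) {
            for (int targetAtomIndex : hit) {
                if (targetAtomIndex >= 0) { // assumption: ignore any negative entries
                    atoms.add(targetAtomIndex);
                }
            }
        }
        return atoms;
    }

    public static void main(String[] args) {
        int[][] hits = { {0, 1, 2}, {2, 3, 4} };
        System.out.println(matchedTargetAtoms(hits)); // prints "[0, 1, 2, 3, 4]"
        System.out.println(matchedTargetAtoms(null)); // prints "[]"
    }
}
```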
For further information, see the following resources:
chemaxon.struc.Molecule objects in the
following ways:
Molecule[] mols;
... //import molecules...
TreeSet smilesTree = new TreeSet(); //for faster searching
String[] smiles = new String[mols.length];
for (int i = 0; i < mols.length; i++) {
    smiles[i] = mols[i].toFormat("smiles:u"); // create unique smiles
    if (!smilesTree.add(smiles[i])) { // process, if already contained
        ... //handle duplicates
    }
}
Molecule[] mols;
MolSearch ms = new MolSearch();
... //import molecules...
HashCode hc = new HashCode();
int[] codes = new int[mols.length];
for (int i = 0; i < mols.length; i++)
    codes[i] = hc.getHashCode(mols[i]); //generate hash code
for (int q = 0; q < mols.length; q++)
    for (int t = q + 1; t < mols.length; t++)
        if (codes[q] == codes[t]) { //if codes equal check with structure searching
            ms.setQuery(mols[q]);
            ms.setTarget(mols[t]);
            if (ms.isMatching()) {
                ... //handle duplicates
            }
        }
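The nested loops above compare every pair of hash codes. For larger sets, one possible variation (not taken from JChem) is to group the molecule indexes by hash code first, so that the structure search only has to run within each group. The sketch uses plain int codes as stand-ins for the values returned by HashCode.getHashCode(...); DuplicateBuckets and groupByCode are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative only: groups molecule indexes that share a hash code. */
final class DuplicateBuckets {

    /**
     * Maps each hash code that occurs more than once to the indexes holding it.
     * Codes that occur only once cannot belong to a duplicate and are dropped.
     */
    static Map<Integer, List<Integer>> groupByCode(int[] codes) {
        Map<Integer, List<Integer>> buckets = new TreeMap<>();
        for (int i = 0; i < codes.length; i++) {
            buckets.computeIfAbsent(codes[i], code -> new ArrayList<>()).add(i);
        }
        buckets.values().removeIf(indexes -> indexes.size() < 2);
        return buckets;
    }

    public static void main(String[] args) {
        int[] codes = {7, 3, 7, 5, 3, 7};
        System.out.println(groupByCode(codes)); // prints "{3=[1, 4], 7=[0, 2, 5]}"
    }
}
```

Equal hash codes only mark candidates. As in the example above, each pair within a group still needs to be confirmed with a structure search before it is treated as a duplicate.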
Tetrahedral centers and double bond stereo configurations are recognized during searching. The information applied by JChem for stereo recognition is
JChemSearch handles all reasonable structures
appropriately.
When the query structure is specified in MDL Molfile or Marvin mrv format for
JChemSearch, and E/Z stereoisomers are searched, the stereo
search attribute (or stereo care flag) of the bonds has to be set. See the
relevant section of the Query
Guide. Furthermore, only the following formats support the enhanced
stereo configuration of stereocenters: MDL extended (V3000) formats, Marvin
mrv, and ChemAxon extended SMILES/SMARTS. More details on these are available
at the following sources:
MolBond class
chemaxon.util.MolHandler object).