The query builder panel is not shown by default, but can be shown using the Window -> Query Builder menu item. It will appear by default in the lower left corner of the IJC window.
A query is built from "query terms" which are described below.
This is the basic query condition used in a query. It comprises a simple query condition such as 'MolWeight < 300', but also includes structure search terms (including Chemical Terms filters). Once added you typically select from the available operators (e.g. '<' in the above example) and enter the value(s) ('300' in the above example). The field is pre-determined when you add the query element and cannot be changed.
In future IJC will support other types of constructs such as Field operator Field e.g. Assay1_IC50 < Assay2_IC50.
For each standard field present in a table, the field name is displayed along with a drop-down list for defining the query operator (=, <, <=, Between, etc.) as well as one or more elements for specifying the values for the query.
Specify the appropriate operator in the drop-down list and enter the value(s) for the query in text box(es).
Most operators take a single value. Exceptions are:
Wildcard searches are possible for text fields using the 'like' operator. Use the appropriate wildcard character for the type of database (usually % for zero or more characters and _ for a single character). For instance 'like amino%' will find all values starting with the term 'amino'. Shortcut operators are provided for some commonly used wildcard searches:
You can search for values that are missing or present using the 'Is Null' and 'Is Not Null' operators. For instance, if you use the 'Is Null' operator you will fetch all rows that do not have a value defined for that field.
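As a rough illustration of how the wildcard and null operators behave, the sketch below runs the equivalent SQL conditions against an in-memory SQLite table. The table name, column names and data are invented for the example, and SQLite is only a stand-in: the database behind your IJC schema may differ in details such as case sensitivity.

```python
import sqlite3

# Hypothetical table and data, used only to illustrate the operators.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE compounds (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO compounds VALUES (?, ?)",
    [(1, "aminobenzene"), (2, "benzene"), (3, "amine"), (4, None)],
)


def ids(where):
    """Return the ids of the rows matching the given WHERE clause."""
    sql = "SELECT id FROM compounds WHERE " + where + " ORDER BY id"
    return [row[0] for row in conn.execute(sql)]


starts_with_amino = ids("name LIKE 'amino%'")  # % matches zero or more characters
five_characters = ids("name LIKE '_____'")     # each _ matches exactly one character
missing = ids("name IS NULL")                  # rows with no value defined
present = ids("name IS NOT NULL")              # rows with a value defined
```

Note that a row with no value never matches a 'like' condition; it is only returned by 'Is Null'.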
The structure query features of IJC are provided by the Marvin and JChem toolkits. See these links for detailed documentation on these features:
With structure fields you can specify queries of type:
In order to edit the structure of the queried field, double-click the structure panel to open Marvin Sketch.
Check the 'Return non-hits' check box if you want to reverse the meaning of the search e.g. find all the structures that don't match the specified structure query.
The different types of search operators have different sets of options.
Default options are specified and are often OK, but you may want to fine tune
how the search executes by specifying different options.
To define the options click on the options button
for a dialog that allows you to specify advanced searching settings, such as
stereochemistry options and similarity search threshold.
Duplicate search options
Duplicate search has a very limited set of options, just those that allow stereochemistry to be turned off and to enable tautomer searching.
Full, Full fragment, Substructure and Superstructure search options
These search types have a wide range of options to control the stereochemistry, atom matching, bond matching and tautomer search options. See the screenshot below.
Similarity search options
Similarity search has quite different options to the other search types. The basic option to specify is the similarity threshold, a number between 0 and 1, where 0 is completely dissimilar and 1 is identical.
In addition to the threshold you can specify a Screening Configuration to use. For normal tables containing molecules the default is Tanimoto distance, but other metrics are available and can be selected from the drop-down list. The most interesting of these is Tversky, which has some additional parameters that can be specified. These are entered into the text box. For Tversky two parameters are needed:
These are entered as comma separated values as shown in the screenshot below.
Other metric types either do not have parameters or their parameters are hard coded special cases of Tversky (e.g. DICE is Tversky with query weight and target weight both being equal to 0.5).
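To make the relationship between these metrics concrete, here is a minimal sketch of the textbook Tversky index on fingerprints represented as sets of "on" bit positions. It is not the JChem implementation; the function names, the set representation and the example fingerprints are assumptions made for illustration. Tanimoto corresponds to both weights being 1, and Dice to both weights being 0.5.

```python
def tversky(query_bits, target_bits, query_weight, target_weight):
    """Textbook Tversky index for two fingerprints given as sets of 'on' bits."""
    common = len(query_bits & target_bits)
    only_query = len(query_bits - target_bits)
    only_target = len(target_bits - query_bits)
    denominator = common + query_weight * only_query + target_weight * only_target
    return common / denominator if denominator else 0.0


def tanimoto(query_bits, target_bits):
    """Tanimoto is the special case with both weights equal to 1."""
    return tversky(query_bits, target_bits, 1.0, 1.0)


def dice(query_bits, target_bits):
    """Dice is the special case with both weights equal to 0.5."""
    return tversky(query_bits, target_bits, 0.5, 0.5)


def is_hit(score, threshold):
    """A target is returned when its score reaches the similarity threshold."""
    return score >= threshold


# Hypothetical fingerprints: 2 bits in common, 2 only in the query, 1 only in the target.
query = {1, 2, 3, 4}
target = {3, 4, 5}
```

With these fingerprints the Tanimoto score is 2 / (2 + 2 + 1) = 0.4, so the target would be a hit at a threshold of 0.3 but not at 0.5. Raising the query weight above the target weight penalises bits that are set only in the query more heavily than bits set only in the target.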
The Screening Configuration is specific to the type of structure table. Reaction tables have a different set of metrics which allow the type of similarity to be defined. The options are:
With structures you can also specify a Chemical Terms filter that can be applied
to the query. To do this, enter the Chemical Terms expression into the Chemical Terms
filter box located beneath the Marvin Sketch panel; alternatively, click on the advanced
button to open the Chemical Terms
editor which will allow you to enter the expression or use one of the pre-defined
favourites. This filter is applied to each result of the search and used as an
additional filter for the search results. An example would be to retrieve only
structures that have a logP of less than 5 by entering the expression
logP() < 5.
Note: Chemical Terms filters are applied dynamically to the query
results. If you have lots of results the search will be much slower with a Chemical
Terms expression as part of the query. If you are frequently using the same Chemical
Terms expressions, you should probably use a Chemical Terms Field instead so that the
values are present in the database and so can be queried directly without being
recalculated each time a query is run.
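The cost difference described in this note can be sketched as follows. This is only a toy model: fake_logp is a placeholder that stands in for an expensive property calculation (it does not compute a real logP), and the structures, function names and call counter are invented for the example.

```python
calls = 0  # counts how often the "expensive" calculation runs


def fake_logp(structure):
    """Placeholder for an expensive calculation; NOT a real logP."""
    global calls
    calls += 1
    return len(structure) * 0.5


def dynamic_filter(structures, limit):
    """Recalculate the value for every hit on every run, then filter."""
    return [s for s in structures if fake_logp(s) < limit]


def stored_filter(stored_values, limit):
    """Filter on values that were calculated once and stored."""
    return [s for s, value in stored_values.items() if value < limit]


hits = ["CCO", "CCCCCCCCCCCC", "c1ccccc1"]

# Calculated field: one calculation per structure, done once up front.
stored = {s: fake_logp(s) for s in hits}
calls_after_storing = calls

# Two runs against the stored values need no further calculations.
stored_results = [stored_filter(stored, 5), stored_filter(stored, 5)]
calls_after_stored_runs = calls

# Two runs with the dynamic filter recalculate every hit both times.
dynamic_results = [dynamic_filter(hits, 5), dynamic_filter(hits, 5)]
calls_after_dynamic_runs = calls
```

Both approaches return the same structures, but the dynamic filter repeats the calculation for every hit on every run, which is why the cost grows with the size of the result set.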
For a query to be executed, all of its elements must be valid. When a term is first added to the query it may be in an invalid state because you have not specified the required values.
The elements of the query term you can specify depend on the Field type. Typically you will specify the operator and one or more values. Once the terms have been correctly specified the query element will be valid.
Repeat this for all the Fields which you want to include in the complete query. If you wish to exclude a Field from the query set its operator to 'Ignore'.
Any part of the query can be collapsed to take up less space. Collapsed elements display a text summary of the current query criteria. Elements that are set to ignore are displayed as collapsed by default.
This is a composite query term that contains simple query conditions or other composite
query terms as child elements. As implied by the name, all the child conditions must
apply for this type of query element to match, e.g. it
has a meaning like this: MolWeight < 300 AND logP > 2.
This type of query term must be added to another AND or OR parent element, or be the root element
of the query tree. To add an AND query element, select the Entity for which you want
the AND to apply (see house rule #1) from the Entity selector in the control panel and
click on the 'AND' button. This allows you to build up relational queries where conditions
from multiple Entities are part of the total query.
The new AND query term will be added to (or after) the
currently selected element in the query tree.
This is a composite query term that is very similar to the AND query term. The difference
is that only one of the child elements needs to match for the query to apply. So the equivalent to
the previous example would be MolWeight < 300 OR logP > 2, meaning that only one
of the conditions needs to be true, not all of them.
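The way simple conditions nest inside AND and OR terms can be modelled as a small tree, sketched below. The class names, the dictionary-based rows and the limited operator table are assumptions made for this illustration and do not reflect how IJC represents queries internally.

```python
import operator
from dataclasses import dataclass
from typing import List

# Only a few operators are modelled in this sketch.
OPERATORS = {"<": operator.lt, ">": operator.gt, "=": operator.eq}


@dataclass
class Condition:
    """A simple query term such as MolWeight < 300."""
    field_name: str
    op: str
    value: float

    def matches(self, row):
        actual = row.get(self.field_name)
        # A missing value never satisfies a comparison.
        return actual is not None and OPERATORS[self.op](actual, self.value)


@dataclass
class And:
    """Composite term: every child must match."""
    children: List[object]

    def matches(self, row):
        return all(child.matches(row) for child in self.children)


@dataclass
class Or:
    """Composite term: at least one child must match."""
    children: List[object]

    def matches(self, row):
        return any(child.matches(row) for child in self.children)


light = Condition("MolWeight", "<", 300)
lipophilic = Condition("logP", ">", 2)

# Hypothetical row: satisfies the MolWeight condition but not the logP one.
row = {"MolWeight": 250.0, "logP": 1.0}

and_result = And([light, lipophilic]).matches(row)
or_result = Or([light, lipophilic]).matches(row)
nested_result = And([light, Or([lipophilic, Condition("logP", "=", 1.0)])]).matches(row)
```

For this row the AND term does not match because logP is not greater than 2, while the OR term matches because the MolWeight condition alone is enough. Composite terms can be children of other composite terms, as in the nested example.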
All the elements in the query tree can be expanded or collapsed as needed. Expanding shows the full details and allows editing. Collapsing provides a descriptive summary that allows a more compact display of the query.
Copyright © 1998-2009 ChemAxon Ltd. All rights reserved.