JChem requires the Sun Java Runtime Environment (JRE), Standard Edition, version 1.4.2 or later. Equivalent JREs from other vendors may work, but are not recommended. For maximum performance we recommend using the latest stable release from Sun Microsystems.
The most probable causes:
Make sure that the URL is appropriate (e.g. jdbc:oracle:thin:@myhost:1521:mySID). Check that all required services are running (the listener service is necessary).
If you would like to execute SQL statements using ADO, you may choose between ODBC and OLE DB connections. (ADO cannot connect to databases through JDBC drivers.)
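// Alternative 1: connect through an ODBC data source
var adoConnectionString=
"DSN="+MyDSN+";"+
"UID="+username+";"+
"PWD="+password;
// Alternative 2: connect through OLE DB (MSDAORA is Microsoft's OLE DB provider for Oracle)
var adoConnectionString=
"Provider=MSDAORA.1;"+
"Data Source="+myServiceName+";"+
"User ID="+username+";"+
"Password="+password;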
See <jchem dir>/examples/index.html for a description. The suggested approach is to call JChem's ActiveX components from the C++ code.
Include <Tomcat home>/conf/tomcat-apache.conf
Follow these steps when upgrading:
jcman u
For example, if you are running a JSP application in Tomcat:
This indicates that the table could not be loaded into the structure cache due to lack of sufficient memory.
There should be sufficient memory for the Java Virtual Machine (JVM) to load the structural and fingerprint information of all structure tables.
Please see the section on how to estimate the memory needs of your structure tables.
The out of memory section describes how to allocate more memory for the JVM.
Java applications in general:
In most Java Virtual Machines the default maximum heap size is 64 MB.
The maximum heap size of applications running under Sun's environment
can be increased with the -Xmx parameter.
A general example that allows 128 MB for an application:
java -Xmx128m my.Application
JChem applications:
The JChem application startup files (Windows batch files and Unix shell
scripts) specify an application-specific value, which can be easily edited
in the startup file.
Web applications:
If your problem occurs in Tomcat, please see the
Tomcat configuration page.
If you use a different servlet server, then please consult the
documentation of the software for details.
Please also see the "Memory" section of hardware requirements for more information.
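The -Xmx setting described above can also be applied when an application is started from a script. The following is only a minimal sketch: the helper name java_command is hypothetical, and my.Application is the placeholder class from the example above.

```python
def java_command(main_class, max_heap_mb):
    """Return the argument list for starting a Java class with a given maximum heap size.

    max_heap_mb is passed to the JVM as -Xmx<value>m, as in the example above.
    """
    if max_heap_mb <= 0:
        raise ValueError("max_heap_mb must be positive")
    return ["java", f"-Xmx{max_heap_mb}m", main_class]


if __name__ == "__main__":
    # Reproduces the general example: java -Xmx128m my.Application
    print(" ".join(java_command("my.Application", 128)))
```

The returned list can be handed to a process launcher such as subprocess.run if the application should actually be started.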
Keyboard shortcuts for the Copy, Paste, and Cut functions
may vary by Look and Feel. For example, with the Windows Look and
Feel these are CTRL+C, CTRL+V, and CTRL+X,
respectively. With the Motif Look and Feel the shortcuts for the same
commands are CTRL+INS, SHIFT+INS, and SHIFT+DEL.
"I have tested Oracle 10g with JChem 3.1.3 using a database of 50 million
molecules which works without any problem (and very fast!) on an 4CPU SGI
Altix 330 12GB RAM machine."
Lutz Weber, OntoChem
In a single table the maximum number of rows is 2^31-1 (2,147,483,647).
We may increase this in the future if needed.
There is no practical limit for the number of tables.
Typically 1 million drug-like structures consume around 100 MB memory in
the structure cache of JChem.
Note: although JChem can drop the least recently used table from the structure
cache if low on memory, it is recommended that all structure tables should fit
in the cache (as cache loading can take a considerable amount of time).
When estimating the memory need simply sum the number of rows in the tables.
The following table shows typical memory needs in a benchmark test:
| Test specifications: | |
|---|---|
| Number of molecules: | 3,003,012 |
| Fingerprint size: | 16*32=512 bits |
| Average SMILES length per molecule: | 37.8 |
| Memory consumption: | 277.01 MB |
| Caching time*: | 159.2 seconds |
* The test configuration was exactly the same as in the cartridge benchmark.
Memory need increases with the number of molecules, the size of the
fingerprints, and the average SMILES string length. The following
approximation can be used:
Memory_need [bytes] = Number_of_molecules *
(0.5*Average_smiles_length[characters] + Fingerprint_size[bits]/8 + 13.5)
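As a worked example, the approximation above can be applied to the benchmark table and summed over several tables. This is only a sketch: the function and variable names are illustrative, and converting bytes to MB with 1 MB = 2^20 bytes is an assumption, since the unit convention of the benchmark is not stated.

```python
def estimate_cache_bytes(n_molecules, avg_smiles_len, fingerprint_bits):
    """Approximate structure cache memory need in bytes for one table.

    Implements: N * (0.5 * avg_smiles_len + fingerprint_bits / 8 + 13.5)
    """
    per_molecule = 0.5 * avg_smiles_len + fingerprint_bits / 8 + 13.5
    return n_molecules * per_molecule


# One entry per structure table; the values below are those of the benchmark table.
tables = [
    {"rows": 3_003_012, "avg_smiles": 37.8, "fp_bits": 512},
]

total_bytes = sum(
    estimate_cache_bytes(t["rows"], t["avg_smiles"], t["fp_bits"]) for t in tables
)

if __name__ == "__main__":
    # Assumption: 1 MB = 2**20 bytes
    print(f"Estimated cache size: {total_bytes / 2**20:.1f} MB")
```

For the benchmark values this gives about 276 MB, close to the measured 277.01 MB. To estimate several tables, add one entry per table to the list.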
The structure table fingerprint statistics function reports the average SMILES length and fingerprint size of a JChem table or JChem index. For more information, see the JChem Manager command line usage (s command), the Cartridge index statistics function, and the Statistics tab of the Instant JChem Schema editor.
The total Java memory consumption (heap size) consists of:
To determine the total amount of RAM requirement you should add:
64-bit systems: A single Java process cannot allocate more than around 2 GB on 32-bit systems. If your Java memory needs exceed this limit, a 64-bit system is recommended (including hardware, operating system, and Java).
Apart from the number of structures, it also depends on the format of the input file and the type of RDBMS engine used by JChem Base.
A benchmark result with 1 million structures (the NCI dataset was multiplied for the test), using Oracle as the RDBMS:
| Format | Size of JChem table | Original file size |
|---|---|---|
| SDF | 1.2 GB | 3.7 GB |
| SMILES | 270 MB | 50 MB |
Note: JChem compresses SDF in the cd_structure field by default (in the case of the NCI dataset to roughly 4.4 times smaller). Compression can be disabled (e.g. if the structures need to be displayed directly by non-ChemAxon tools), but the storage size increases in this case.
Comparing the performance of different hardware architectures is a complex topic and not the subject of this FAQ.
Some quick facts:
The following benchmarks can be used as a starting point:
For the same database the search time greatly depends on the type of query structure. For a very general query (e.g. benzene) there will be a lot of hits, meaning longer execution time, while more specific queries run very fast on the same large database.
The search time consists of:
The following can be stated:
Tip: it is rarely useful to return a huge number of hits
(especially for human consumption).
If the number of hits is limited, only the rapid screening time
increases with table size, so the total search time remains
almost constant regardless of the table size.
The following tests demonstrate the speed of substructure search in
JChem.
The test configuration was exactly the same as in the
cartridge benchmark, and the same query structures
were used.
Substructure search results, caching used: true
| Screening hits* | Screening time* (ms) | Number of hits | Total time (ms) |
|---|---|---|---|
| 0 | 113 | 0 | 119 |
| 0 | 179 | 0 | 194 |
| 12 | 94 | 12 | 125 |
| 204 | 104 | 204 | 134 |
| 900 | 90 | 864 | 225 |
| 936 | 90 | 936 | 169 |
| 1272 | 90 | 1188 | 182 |
| 1308 | 90 | 1188 | 240 |
| 1704 | 90 | 1560 | 208 |
| 1800 | 91 | 1764 | 196 |
| 4848 | 91 | 4608 | 334 |
| 65460 | 104 | 65208 | 2579 |
| 281292 | 121 | 274608 | 5830 |
| 472236 | 143 | 443508 | 11660 |
The following table shows the duration of import in some cases.
The configuration was exactly the same as in the
cartridge benchmark.
The -server JVM option was applied.
| Number of structures | Elapsed time (duplicates allowed) | Elapsed time (duplicates not allowed) |
|---|---|---|
| 10,000 | 13 sec | 16 sec |
| 100,000 | 1 min 9 sec | 1 min 56 sec |
| 200,000 | 2 min 3 sec | 3 min 38 sec |
Notes:
The tables below show search durations in JChem Cartridge, measured with
the following configuration:
Hardware (purchased in February of 2005):
Intel Quad CPU Q6600 2.40GHz desktop PC, 8GB memory, 2x750GB SATA hard drive in RAID 0 (with write cache enabled)
Software Environment:
| OS Name | Distribution Name | Kernel Version |
|---|---|---|
| Linux | CentOS release 4.6 | 2.6.9-67.0.7.ELsmp x86_64 |
| Oracle Version | SGA Max Size (MB) | SGA Target (MB) | DB Buffer Cache Size (MB) | Shared Pool Size (MB) | Java Pool Size (MB) | Large Pool Size (MB) |
|---|---|---|---|---|---|---|
| 10.2.0.3 | 0 (unset) | 1536 | 1024 | 160 | 160 | 16 |
| jvm_version | tomcat_version |
|---|---|
| 1.5.0_15 | |
Target molecule set: NCI database August 2000 (250251 molecules multiplied to 3 million)
Number of structures: 3 million
Session date: 2008-05-01
| Operation Type | Query Structure | Number Of Hits | Total Time (ms) | SSS Time (ms) | Screened Count | Screening Time (ms) |
|---|---|---|---|---|---|---|
| t:s earlyResults:2000 | Clc1cncc2c(cnnc12)N3CC3 | 0 | 112 | 96 | 0 | 90 |
| t:s earlyResults:2000 | C1CN1c2cnnc3c(cncc23)C4=CSC=C4 | 0 | 201 | 186 | 0 | 179 |
| t:s earlyResults:2000 | CCSc1c(C=C(C=O)C#N)c2ccccn2c1C(O)=O | 12 | 119 | 99 | 12 | 89 |
| t:s earlyResults:2000 | O=C1ONC(N1c2ccccc2)-c3ccccc3 | 204 | 141 | 112 | 204 | 90 |
| t:s earlyResults:2000 | Nc1cc(cc2cc(c(N=N)c(O)c12)S(O)(=O)=O)S(O)(=O)=O | 864 | 320 | 255 | 900 | 89 |
| t:s earlyResults:2000 | NN=C1C(=O)NC(=S)N(C1=O)c2ccccc2O | 936 | 197 | 149 | 936 | 80 |
| t:s earlyResults:2000 | Oc1c(N=N)c(cc2cc(ccc12)S(O)(=O)=O)S(O)(=O)=O | 1188 | 280 | 219 | 1308 | 79 |
| t:s earlyResults:2000 | Cc1cc(C)nc(NS(=O)(=O)c2ccccc2)n1 | 1224 | 233 | 171 | 1272 | 79 |
| t:s earlyResults:2000 | COc1ccc2nc3cc(Cl)ccc3cc2c1 | 1560 | 269 | 195 | 1704 | 79 |
| t:s earlyResults:2000 | C(Sc1ncnc2ncnc12)-c3ccccc3 | 1764 | 277 | 197 | 1800 | 80 |
| t:s earlyResults:2000 | NC1=CC=NC2=C1C=CC(Cl)=C2 | 4632 | 434 | 339 | 4848 | 81 |
| t:s earlyResults:2000 | c1ncc2ncnc2n1 | 65208 | 3425 | 3331 | 65460 | 98 |
| t:s earlyResults:2000 | Clc1ccccc1 | 274608 | 10450 | 8028 | 281292 | 114 |
| t:s earlyResults:2000 | O=Cc1ccccc1 | 443508 | 17548 | 16511 | 472236 | 151 |
| sep=! t:s!ctFilter:(PSA() <= 200) && (rotatableBondCount() <= 10) && (mass() <= 500) && (aromaticRingCount() <= 4) | O=C1ONC(N1c2ccccc2)-c3ccccc3 | 204 | 313 | 290 | 204 | 84 |
| sep=! t:s!ctFilter:(mass() <= 500) && (logP() <= 5) && (donorCount() <= 5) && (acceptorCount() <= 10) | O=C1ONC(N1c2ccccc2)-c3ccccc3 | 36 | 380 | 350 | 204 | 80 |
| jc_tanimoto > 0.9 | O=C1ONC(N1c2ccccc2)-c3ccccc3 | 0 | 1137 | 1123 | 0 | 1123 |
| jc_tanimoto > 0.9 | CCSc1c(C=C(C=O)C#N)c2ccccn2c1C(O)=O | 0 | 1162 | 1146 | 0 | 1146 |
| jc_tanimoto > 0.9 | Nc1cc(cc2cc(c(N=N)c(O)c12)S(O)(=O)=O)S(O)(=O)=O | 24 | 1195 | 1136 | 24 | 1136 |
jc_idxtype-indexed regular structure table: 1 minute 20 seconds
INSERT INTO mytable VALUES(<structure>);
jc_insert('<structure>', 'mytable', null, 'false', 'false');
jc_insert('SELECT structure FROM tmptable', 'mytable', null, 'false', 'false');
The following table shows the duration of indexing in some cases.
The configuration was exactly the same as in the
cartridge benchmark.
| Number of structures | Elapsed time |
|---|---|
| 10,000 | 10 sec |
| 100,000 | 45 sec |
| 200,000 | 1 min 22 sec |
Here are some URL settings for JDBC connections to different database
engines.
(For more complex cases please see the documentation of the JDBC
driver.)
Note: we recommend ODBC only for Microsoft Access. For other databases
the native JDBC drivers should be used.
| Database | JDBC driver | URL format | Example |
|---|---|---|---|
| Oracle (thin driver) | oracle.jdbc.driver.OracleDriver | jdbc:oracle:thin:@[host address]:[port]:[database sid] | jdbc:oracle:thin:@localhost:1521:mydb |
| Oracle (OCI8 driver) | oracle.jdbc.driver.OracleDriver | jdbc:oracle:oci8:@[host]:[port]:[database sid] | jdbc:oracle:oci8:@localhost:1521:mydb |
| MySQL | com.mysql.jdbc.Driver | jdbc:mysql://[host]/[database] | jdbc:mysql://localhost/mydb |
| MS Access via ODBC | sun.jdbc.odbc.JdbcOdbcDriver | jdbc:odbc:[odbc data source] | jdbc:odbc:mydatasource |
| PostgreSQL | org.postgresql.Driver | jdbc:postgresql://[host]:[port]/[database] | jdbc:postgresql://localhost:5432/mydb |
| DB2 | COM.ibm.db2.jdbc.net.DB2Driver | jdbc:db2://[host]/[database] | jdbc:db2://localhost/mydb |
| InterBase | interbase.interclient.Driver | jdbc:interbase:[path to interbase data file i.e. the .gdb file] | jdbc:interbase://localhost/c:/interbase/interbasedb.gdb |
| MS SQLServer | com.microsoft.jdbc.sqlserver.SQLServerDriver | jdbc:microsoft:sqlserver://[host]:[port];DatabaseName=[database];SelectMethod=cursor | jdbc:microsoft:sqlserver://localhost:1433;DatabaseName=mydb;SelectMethod=cursor |
| MS SQLServer 2005 | com.microsoft.sqlserver.jdbc.SQLServerDriver | jdbc:sqlserver://[host]:[port];databaseName=[database];SelectMethod=cursor | jdbc:sqlserver://localhost:1433;databaseName=mydb;SelectMethod=cursor |
| HSQLDB / HXSQL | org.hsqldb.jdbcDriver | jdbc:hsqldb:hsql://[host]/[database] | jdbc:hsqldb:hsql://localhost/ |
| Derby | org.apache.derby.jdbc.EmbeddedDriver | jdbc:derby:[database];create=true | jdbc:derby:mydb;create=true |
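The URL formats in the table can be treated as templates that are filled in with the host, port, and database name. The sketch below covers only a few of the engines listed; the dictionary keys and the function name jdbc_url are illustrative and not part of JChem or any JDBC driver.

```python
# URL templates copied from the table above; placeholders renamed to Python format fields.
JDBC_URL_TEMPLATES = {
    "oracle_thin": "jdbc:oracle:thin:@{host}:{port}:{sid}",
    "mysql": "jdbc:mysql://{host}/{database}",
    "postgresql": "jdbc:postgresql://{host}:{port}/{database}",
    "sqlserver_2005": (
        "jdbc:sqlserver://{host}:{port};"
        "databaseName={database};SelectMethod=cursor"
    ),
}


def jdbc_url(engine, **parts):
    """Fill in the URL template of the given engine with the supplied parts."""
    try:
        template = JDBC_URL_TEMPLATES[engine]
    except KeyError:
        raise ValueError(f"no URL template for engine {engine!r}") from None
    return template.format(**parts)


if __name__ == "__main__":
    print(jdbc_url("oracle_thin", host="localhost", port=1521, sid="mydb"))
    print(jdbc_url("postgresql", host="localhost", port=5432, database="mydb"))
```

Keeping the templates in one place makes it easy to check a generated URL against the examples in the table before passing it to JChem.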
In the case of database input, one should append " ORDER BY CD_ID" to the SQL query. For compr only the second query should be modified.
Sun removed the ActiveX Bridge support from JDK/JRE 1.4 and replaced it with a different version in 1.4.2 and later. We are planning to support this new version. Until then we suggest using JRE version 1.3.1 for ASP integration.
Please increase the value of the "max_allowed_packet" variable for MySQL. The following line should be added to the configuration file "my.ini" under the [mysqld] section:
max_allowed_packet = 100M
JChem works with the standardized form of imported structures, stored in ChemAxon Extended SMILES format. This extended format can represent a wider range of structures than SMILES, but there are still some cases in which it is not applicable. In these cases the "cd_smiles" field is null, and JChem uses the "cd_structure" field for these rows. (The "cd_structure" field holds the structures in the original input format.)
Currently the cd_smiles is null in the following cases:
In these cases the search is slower, since the target structures have to be
standardized on the fly.
Note: For most databases the size of the "cd_smiles" field can be increased
in the table creation dialog (in the SQL text). The increased length is
utilized automatically.
This can speed up the search if a high percentage of the structures are
huge.
Sometimes the data structure of JChem changes in ways that are incompatible with earlier versions. To obtain correct search results, the old structure tables must be regenerated. For more information and instructions please see the administration guide.
In some cases the JCMAN GUI may not work on a Linux server accessed from a Windows client via an X server. Opening your ssh connection with the -Y option may solve the problem.
Related forum topic: http://www.chemaxon.com/forum/viewtopic.php?p=15322#15322
If your question is not answered here, please check out our forum.