Post subject: JChem search exact search speed issue
Posted: Tue Dec 18, 2007 4:26 pm
If we switch off caching mode, the exact structure search becomes about 10 times faster (I know it is deprecated).
With caching mode, a search takes 700-800 ms per molecule; we would expect about 100 ms.
We have 7 000 000 molecules in our structure table.
Here is the log from JChem:
with cache:
Tue Dec 18 16:01:28 CET 2007
Search mode: EXACT
Structure table: DBO.MOLECULES
Query: [#7]
Screened: 1
Hits: 1
Total time: 727 ms Screening: 696 ms
Processing threads: 2
Current / peak / maximum searches per minute: 9 / 9 / Unlimited
no cache:
Tue Dec 18 16:30:59 CET 2007
Search mode: EXACT
Structure table: DBO.MOLECULES
Query: [#8-]
Screened: 1
Hits: 1
Total time: 93 ms Screening: 23 ms
Processing threads: 2
Current / peak / maximum searches per minute: 9 / 9 / Unlimited
Any idea?
Thanks
Gabor
Szilard
Joined: 21 May 2004
Posts: 935
ChemAxon personnel
The difference is in the screening time (the phase for selecting hit candidates for the slower graph search).
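To make that concrete, here is a small arithmetic sketch using only the numbers from the two log excerpts above. The dictionary layout and the helper name `rest_ms` are mine, purely for illustration; nothing here calls JChem.

```python
# Timings copied from the two JChem log excerpts in this thread (milliseconds).
logs = {
    "with cache": {"total_ms": 727, "screening_ms": 696},
    "no cache": {"total_ms": 93, "screening_ms": 23},
}


def rest_ms(entry):
    """Time spent outside the screening phase (total minus screening)."""
    return entry["total_ms"] - entry["screening_ms"]


for mode, entry in logs.items():
    share = entry["screening_ms"] / entry["total_ms"]
    print(f"{mode}: screening share {share:.0%}, rest {rest_ms(entry)} ms")

# How much slower the cached run is, overall and in screening alone.
total_ratio = logs["with cache"]["total_ms"] / logs["no cache"]["total_ms"]
screening_ratio = logs["with cache"]["screening_ms"] / logs["no cache"]["screening_ms"]
print(f"total: {total_ratio:.1f}x, screening: {screening_ratio:.1f}x")
```

Outside screening, the cached run is actually the faster of the two (31 ms versus 70 ms), so the whole gap comes from the screening phase.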
The discrepancy is due to a "trick" we apply in this phase:
The cd_hash column in the database table is normally used for speeding up duplicate filtering (PERFECT search).
This cannot be used for EXACT search in general, as the hits are not necessarily identical to the query (e.g. a "single-or-double" bond must match both bond types, and an "any atom" can match anything).
If the query does not have such features, we can "cheat" and use the hash code.
This speedup is currently not applied in cached mode, which explains the discrepancy in the search times.
We are planning to improve on this in the future.
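The decision described above can be sketched roughly as follows. This is a toy illustration, not JChem code: the feature labels, the function names, and the use of Python's built-in `hash` in place of the real cd_hash value are all assumptions, and the "every row is a candidate" fallback merely stands in for the regular, slower screening.

```python
from collections import defaultdict

# Hypothetical labels for query features that can match more than one structure.
GENERIC_ATOM_LABELS = {"any"}
GENERIC_BOND_LABELS = {"single-or-double"}


def has_generic_features(atoms, bonds):
    """True if the query contains features that rule out the hash shortcut."""
    return (any(a in GENERIC_ATOM_LABELS for a in atoms)
            or any(b in GENERIC_BOND_LABELS for b in bonds))


def build_hash_index(table):
    """Map a stand-in hash of each stored structure key to its row ids."""
    index = defaultdict(list)
    for row_id, structure_key in table.items():
        index[hash(structure_key)].append(row_id)
    return index


def select_candidates(query_key, atoms, bonds, table, index):
    """Screening step: pick the rows to pass on to the slower graph search."""
    if has_generic_features(atoms, bonds):
        # Hash shortcut not applicable; stand-in for the regular screening.
        return sorted(table)
    # Plain query: only rows with the same hash can be exact matches.
    return sorted(index.get(hash(query_key), []))


# Toy table: row id -> structure key (made-up values).
table = {1: "N", 2: "[O-]", 3: "CC"}
index = build_hash_index(table)

plain = select_candidates("N", ["N"], [], table, index)
generic = select_candidates("A", ["any"], [], table, index)
print(plain, generic)
```

With a plain query the candidate list shrinks to the rows sharing the hash, which matches the "Screened: 1" line in both logs; a query with generic features has to go through the full screening.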
By the way, do you use EXACT search for finding duplicate structures?
In that case I recommend PERFECT search mode, which is specifically designed for handling this.
Please see the chemistry differences in our Query Guide:
http://www.chemaxon.com/jchem/doc/user/Query.html#otherSearchTypes
The search time should be similar to your faster measurement.
Best regards,
Szilard