Methods for robust and efficient tautomer enumeration, tautomer searching and tautomer duplicate filtering
Tautomerism is an important and difficult problem in cheminformatics, and it has attracted considerable attention recently. This presentation focuses on ChemAxon’s approaches and algorithms for handling tautomerism.
The presentation covers four main topics:
1. The tautomerization calculator plugin is the basis of most of the methods. It can identify tautomerizable regions, enumerate all or only the dominant tautomers, and predict the distribution of the dominant tautomers. It also provides the generic and canonical tautomers used by the methods discussed below. The plugin first identifies possible proton donors and acceptors and finds the tautomerization paths between them. Depending on the desired operation, it then combines the paths into regions (generic tautomer), combinatorially enumerates all possible tautomeric forms (all tautomers), filters and ranks the enumerated structures based on pKa and other criteria (dominant tautomers), or canonicalizes using empirical rules (canonical tautomer). The tautomerization plugin is also used to improve the results of other calculations, such as macro pKa and logP.
2. Tautomer duplicate search uses generic tautomers combined with a hash key. This method also allows fast filtering of tautomers in chemical database tables. We will show how it handles tautomeric migration of H isotopes and interactions with stereochemistry.
3. Tautomer substructure search enumerates the tautomers of the query and searches for each of them separately. When the query carries H constraints (explicit H), the constraints are enforced on the tautomeric region so that only true tautomers are retrieved.
4. Standardizer is a tool for performing custom and built-in transformations on molecules. It is integrated with the JChem chemical database system, so that database and query structures are automatically processed with the specified transformations. We will show how the canonical tautomer and custom transformations can be used to handle tautomerism. Custom transformations also allow handling of ring-chain tautomerism.
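
The idea behind topic 2, reducing each structure to a tautomer-independent form and comparing hash keys, can be illustrated with a small Python sketch. This is not ChemAxon's algorithm or API. It makes a deliberately coarse assumption: the "generic" form is the heavy-atom skeleton with bond orders ignored plus the total hydrogen count, over the whole molecule rather than only the tautomeric regions. It therefore also merges some non-tautomers (for example, double-bond positional isomers), and it ignores isotopes and stereochemistry. The function names, the input format, and the Weisfeiler-Lehman style labelling are all hypothetical choices made for illustration, and equal keys should be read as a fast prefilter rather than proof of equivalence.

```python
import hashlib
from collections import defaultdict


def _digest(text):
    """Stable hex digest of a string (the same across runs and machines)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def tautomer_hash_key(elements, bonds, h_counts, rounds=3):
    """Hypothetical coarse tautomer key, NOT ChemAxon's generic tautomer.

    elements: element symbols of the heavy atoms
    bonds:    (i, j) index pairs; bond orders are deliberately ignored
    h_counts: hydrogens per heavy atom; only their sum enters the key,
              so moving an H between atoms leaves the key unchanged
    """
    neighbors = defaultdict(list)
    for i, j in bonds:
        neighbors[i].append(j)
        neighbors[j].append(i)

    # Weisfeiler-Lehman style refinement: each atom label absorbs the sorted
    # labels of its neighbors, so the result does not depend on atom order.
    labels = list(elements)
    for _ in range(rounds):
        labels = [
            _digest(labels[i] + "|" + ",".join(sorted(labels[j] for j in neighbors[i])))
            for i in range(len(elements))
        ]
    return _digest(",".join(sorted(labels)) + "|H" + str(sum(h_counts)))


def filter_tautomer_duplicates(molecules):
    """Keep the first molecule seen for each key; return the kept names."""
    seen = set()
    unique = []
    for name, elements, bonds, h_counts in molecules:
        key = tautomer_hash_key(elements, bonds, h_counts)
        if key not in seen:
            seen.add(key)
            unique.append(name)
    return unique


# Six-membered ring with N at index 0, carbons at 1-5, and O at index 6.
RING = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
ATOMS = ["N", "C", "C", "C", "C", "C", "O"]

MOLECULES = [
    # O on the carbon next to N, mobile H on N
    ("2-pyridone", ATOMS, RING + [(1, 6)], [1, 0, 1, 1, 1, 1, 0]),
    # same skeleton, mobile H on O
    ("2-hydroxypyridine", ATOMS, RING + [(1, 6)], [0, 0, 1, 1, 1, 1, 1]),
    # O moved one carbon further from N: a different skeleton
    ("3-hydroxypyridine", ATOMS, RING + [(2, 6)], [0, 1, 0, 1, 1, 1, 1]),
]

print(filter_tautomer_duplicates(MOLECULES))
# prints ['2-pyridone', '3-hydroxypyridine']
```

In a database setting, a key of this kind could be stored in an indexed column so that candidate duplicates are found by an equality lookup, with an exact structure comparison applied only to the rows whose keys match.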