Standardizer

Structure canonicalization and more

Chemical compounds can appear in various forms depending on the source and even on the habits of the chemist drawing them. These differences affect not only the graphical appearance of the molecules but can also change more fundamental details of the topology. Different resonance structures, tautomers, salts and solvents might appear in the representations, making compound identification even more problematic. Standardizer is ChemAxon’s solution for transforming chemical structures into customized, canonical representations so that chemical databases can be used as reliably as possible.

Product Type: component
Interfaces: CLI, GUI (Desktop), API (Java, .NET)

Use standardization actions to get uniform structure representations

Robust searching with consistent representations

Certain patterns in chemical structures can be drawn in more than one form, and depending on the search conditions this can impair structure-based searching. A typical example is the nitro group, which can be present in its charged or neutral form. Standardizer’s main purpose is to transform chemical structures into representations that obey a chosen set of chemical business rules, so that such inconsistencies do not enter a chemical database.
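The nitro example can be sketched in a few lines of Python. This is a toy illustration only: it rewrites SMILES text with plain string replacement, which is not how Standardizer works and is not chemically robust. The function names and the choice of the charge-separated form as the target are assumptions made for the example.

```python
# Toy illustration only: plain string rewriting on SMILES text.
# This is NOT Standardizer's implementation and is not chemically robust.

NEUTRAL_NITRO = "N(=O)=O"        # uncharged drawing of the nitro group
CHARGED_NITRO = "[N+](=O)[O-]"   # charge-separated drawing (assumed target form)


def normalize_nitro(smiles: str) -> str:
    """Rewrite the neutral nitro spelling to the charge-separated one."""
    return smiles.replace(NEUTRAL_NITRO, CHARGED_NITRO)


def same_after_normalization(a: str, b: str) -> bool:
    """Compare two SMILES strings after applying the nitro rule to both."""
    return normalize_nitro(a) == normalize_nitro(b)


neutral = "c1ccccc1N(=O)=O"
charged = "c1ccccc1[N+](=O)[O-]"

print(neutral == charged)                          # False
print(same_after_normalization(neutral, charged))  # True
```

Without the rule, the two drawings of nitrobenzene compare as different strings. Once both pass through the same rule they agree, which is the kind of consistency a search depends on.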

Structure matching

Create canonical structures

Making structures uniform can change their molecular graph. These modifications are the most invasive and have the greatest influence on the search process. They may include, among others, the addition or removal of explicit hydrogen atoms and the neutralization of charged fragments and functional groups. Representations of functional groups commonly used in old databases (e.g. aliases) can also be recognized and converted by Standardizer. Besides graph modifications, standardization actions can also remove certain fragments, e.g. water and salt counterions.
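As a rough sketch of fragment removal, the snippet below drops unwanted components from a dot-separated SMILES string. It is a simplified stand-in written for illustration: the fragment list and the function name are hypothetical, and matching fragments by their exact text is much cruder than a real structure-based comparison.

```python
# Toy illustration: operates on dot-separated SMILES fragments by exact text match.
# The fragment set below is a made-up example, not a Standardizer default.

UNWANTED_FRAGMENTS = frozenset({"O", "[Na+]", "[K+]", "[Cl-]", "Cl"})


def strip_fragments(smiles: str, unwanted=UNWANTED_FRAGMENTS) -> str:
    """Drop fragments listed in `unwanted`; keep the remaining ones in order."""
    kept = [frag for frag in smiles.split(".") if frag not in unwanted]
    return ".".join(kept)


print(strip_fragments("CC(=O)[O-].[Na+]"))  # CC(=O)[O-]
print(strip_fragments("CN.Cl.O"))           # CN
```

The acetate in the first example stays charged after the counterion is removed. Neutralization would be a separate action applied afterwards.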

Standardizer actions

Unify graphical representations

Besides topological modifications, the graphical representation of compounds is also essential for everyday research. A consistent orientation, cleanly drawn structures and a uniform relative arrangement of fragments help chemists browse and recognize compounds. Standardization actions such as 2D cleaning and expanding abbreviated groups make the structures easier to read. Unifying the orientation of the compounds by template-based cleaning makes chemical libraries clearer and easier to scan.

2D Align

An easy way to fully customized databases

Create custom transformations

Thanks to an advanced transformation engine under the hood, custom standardization actions can be defined that perform almost any kind of transformation on compounds, for instance the removal or replacement of atoms, functional groups or patterns in the structures. Whereas predefined standardization actions have limited customization options, this feature gives almost unlimited freedom in defining canonicalization rules.
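The idea of user-defined rules applied in a fixed order can be sketched as follows. This is an assumption-laden toy: the rules are plain text substitutions on SMILES, whereas a real transformation engine matches chemical patterns. The rule names, the `Rule` tuple layout and `apply_rules` are all hypothetical.

```python
from typing import List, Tuple

# (name, pattern, replacement); a hypothetical layout chosen for this sketch.
Rule = Tuple[str, str, str]

CUSTOM_RULES: List[Rule] = [
    ("nitro to charge-separated", "N(=O)=O", "[N+](=O)[O-]"),
    ("deuterium to hydrogen", "[2H]", "[H]"),
]


def apply_rules(smiles: str, rules: List[Rule] = CUSTOM_RULES) -> Tuple[str, List[str]]:
    """Apply each rule in order; return the result and the names of rules that fired."""
    fired = []
    for name, pattern, replacement in rules:
        if pattern in smiles:
            smiles = smiles.replace(pattern, replacement)
            fired.append(name)
    return smiles, fired


result, fired = apply_rules("[2H]C([2H])([2H])N(=O)=O")
print(result)  # [H]C([H])([H])[N+](=O)[O-]
print(fired)   # ['nitro to charge-separated', 'deuterium to hydrogen']
```

Keeping the rules as data rather than code means the rule set can be changed without touching the function that applies it. Recording which rules fired gives a simple audit trail.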

Custom Transformation

Standardized registration

Identifying duplicates on registration and representing compounds consistently are both essential in corporate databases, so the standardization process is a key component of most registration systems. While all registered compounds are stored in a compact, canonicalized form, the original input compounds can be kept as well, preserving all input information coming from the chemist. This way, any change to the standardization configuration can be re-applied to the original input structures.
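A minimal sketch of this registration pattern is shown below, assuming the standardizer is just a function from one SMILES string to another. The `Registry` class and its methods are hypothetical names invented for the example and do not correspond to any ChemAxon API. The stand-in standardizer only removes a sodium counterion.

```python
from typing import Callable, Dict


class Registry:
    """Toy registry: stores originals, detects duplicates by standardized form."""

    def __init__(self, standardize: Callable[[str], str]) -> None:
        self._standardize = standardize
        self._originals: Dict[int, str] = {}  # compound id -> input as submitted
        self._by_key: Dict[str, int] = {}     # standardized form -> compound id

    def register(self, original: str) -> int:
        """Return the existing id for a duplicate, otherwise assign a new id."""
        key = self._standardize(original)
        if key in self._by_key:
            return self._by_key[key]
        new_id = len(self._originals) + 1
        self._originals[new_id] = original
        self._by_key[key] = new_id
        return new_id

    def original(self, compound_id: int) -> str:
        """Return the structure exactly as it was first submitted."""
        return self._originals[compound_id]

    def restandardize(self, standardize: Callable[[str], str]) -> None:
        """Switch to a new rule set and rebuild all keys from the stored originals."""
        self._standardize = standardize
        self._by_key = {}
        for compound_id, original in self._originals.items():
            # If two compounds now collide, the lower id keeps the key.
            self._by_key.setdefault(standardize(original), compound_id)


def drop_sodium(smiles: str) -> str:
    """Stand-in standardizer for the demo: remove a sodium counterion fragment."""
    return ".".join(f for f in smiles.split(".") if f != "[Na+]")


registry = Registry(drop_sodium)
first = registry.register("CC(=O)[O-].[Na+]")
second = registry.register("CC(=O)[O-]")

print(first == second)           # True
print(registry.original(first))  # CC(=O)[O-].[Na+]
```

Because the originals are stored untouched, `restandardize` can rebuild every key from them after the rules change, rather than trying to undo an earlier standardization.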

Registration architecture

Easily accessible

Standardizer is available as a standalone, Java-based application. It is platform independent and can be used through a wizard-like graphical user interface or in batch mode. Like all other ChemAxon applications, Standardizer also has a full-featured Application Programming Interface (API) in Java and in .NET, so it can be integrated into in-house or third-party applications. Workflow management tools such as KNIME and Pipeline Pilot also integrate the Standardizer engine.

Standardizer GUI
