Chemicalize.org: A uniquely user-selected PubChem source of structures extracted from text
The chemicalize.org open web application recognizes different types of chemical names in any text source and converts them into structures (the example of a new Wikipedia antimalarial, ELQ-300, is shown on the right). It can thus both disinter millions of structures from their "tombs" in patents, papers, abstracts or web pages and connect them to one another. Since 2009 the service has generated a searchable database of ~300,000 unique structures from user results. The Webpage Viewer and Document Viewer save visited URLs along with the extracted structures. Because the philosophy of chemicalize.org is to make chemistry more accessible, this archive is deposited in PubChem and updated. It is the only PubChem source derived from user-selected content (i.e. documents and URLs are actively chosen, typically on the basis of protein target and/or disease indication). The accumulated structures are thus "collectively crowd-sourced" and link back to chemicalize.org data pages. These pages include predicted properties such as pKa and logP/D, all of which can be downloaded along with SDF, SMILES, IUPAC names, and InChI. The figures below introduce the functionality and show how it can be used in combination with PubChem.