Computer-Assisted Document Curation and Analysis
ChemCurator is a desktop application for extracting chemical information, including compounds, Markush structures, and related assay data, from English, Chinese, and Japanese patents, journal articles, and other documents. Combining ChemAxon’s text mining, structure handling, and Markush technology, the application finds and highlights structures within documents, allowing users to interactively extract chemical information or assemble Markush structures in a semi-automated way.
Why did we develop ChemCurator?
Understanding the competitive landscape is essential in the pharmaceutical industry because of the significant intellectual property associated with the development of a new drug. However, analyzing patents and journal articles can be a very labor-intensive task. Numerous examples and assay data are scattered across a document, and patents usually use complex Markush structures to define very broad chemical spaces. Extracting useful information can be time-consuming even for experienced information scientists. That is why ChemAxon developed ChemCurator to help speed up this process.
How did we develop ChemCurator?
ChemAxon has long been developing informatics tools for IP and life science researchers. Document to Structure can automatically extract chemical names from all types of documents, while Markush Technology can create, enumerate, and search complex Markush structures. ChemCurator combines these existing technologies in a new, intuitive, domain-specific solution to help expedite high-quality document curation.
What can ChemCurator do?
ChemCurator can open a patent document and highlight the structures found in its text and images, including exemplified structures, R-group fragments, etc. The structures can be interactively extracted and displayed with links to their original location in the document, which makes navigation easier and more intuitive.
A unique feature of ChemCurator is that users can simply drag and drop recognized structures and fragments to populate R-groups, and quickly reassemble the Markush structures according to the claims. Once the Markush structure is built, the exemplified structures can be used for validation: ChemCurator can check whether each of them falls within the chemical space the Markush structure defines.
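The validation idea can be illustrated with a small Python sketch. This is a toy under stated assumptions, not a description of how ChemCurator or Markush Technology is implemented: the scaffold, the R-group lists, and the function names are all hypothetical, and structures are compared as plain SMILES-like strings instead of being matched as chemical graphs, which a real tool would need to do.

```python
from itertools import product

# Hypothetical toy Markush definition (not taken from any real patent):
# a scaffold template with two R-group slots and the fragments allowed in each.
SCAFFOLD = "{R1}c1ccc(cc1)C(=O)N{R2}"
R_GROUPS = {
    "R1": ["F", "Cl", "C"],
    "R2": ["C", "CC"],
}


def enumerate_markush(scaffold, r_groups):
    """Return every structure string covered by the toy Markush definition."""
    names = sorted(r_groups)
    return {
        scaffold.format(**dict(zip(names, combo)))
        for combo in product(*(r_groups[name] for name in names))
    }


def is_covered(example, scaffold, r_groups):
    """Check whether an exemplified structure string lies in the enumerated space.

    Assumption: plain string equality stands in for real structure matching.
    """
    return example in enumerate_markush(scaffold, r_groups)


space = enumerate_markush(SCAFFOLD, R_GROUPS)
print(len(space))  # 6 (3 choices for R1 times 2 choices for R2)
print(is_covered("Fc1ccc(cc1)C(=O)NCC", SCAFFOLD, R_GROUPS))  # True
print(is_covered("Brc1ccc(cc1)C(=O)NC", SCAFFOLD, R_GROUPS))  # False
```

Full enumeration is only workable for tiny examples like this one. Markush structures in real patents usually cover far too many compounds to list, so a practical check has to test membership without enumerating the space.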
We developed ChemCurator as a general chemistry text mining tool. It works not only with patent documents but can extract information from any scientific document, such as journal articles, internal reports, presentations, etc. With our Asian language support, it can also work with Chinese and Japanese documents.
Who should use ChemCurator and what’s coming?
We believe ChemCurator can be a very useful chemistry text mining tool for information scientists, bench chemists, computational chemists, patent attorneys, and other scientists who are interested in IP knowledge management. Although ChemCurator cannot completely eliminate human intervention, it can greatly reduce the processing time. We are continuously working to improve ChemCurator, reduce the need for manual work, and automate the curation process as much as possible. Your feedback is essential for us to reach this goal. If you have any questions or suggestions, please do not hesitate to share them with us.