2009 US User Group Meeting

Sept. 15-16, 2009 & Training day Sept. 14, San Diego, CA.


Lead-in
Marvin
JChem
JChem for Excel
JChem beyond Java
Command-Line Tools and Pipelining
Reactor and Metabolizer
Clustering
Markush Structures
Name-to-structure and Structure-to-name
ChemAxon in 3D
Partner Presentations
Summary
UGM archive – all presentations

Summary
The meeting highlighted the extraordinary talent and hard work of the ChemAxon staff in making their software so powerful and yet so easily integrated into many different applications.
In addition, it provided hints of fundamental changes that will come to end-users of cheminformatics tools. For example, with ChemAxon tools embedded in workflow tools, it is much easier for computational chemists and expert users to chain a variety of tasks together for a given purpose. Moreover, such a workflow can be encapsulated and deployed to bench chemists.
Of course the web has changed the way that many things are done. ChemAxon has not ignored this trend and has increased its presence in this area. The URL fields data type in Instant JChem supports accessing web resources within a desktop application, returning text or an image. Web browsing of the scientific literature could be transformed by functions such as those seen in the prototype Chemicalize.org, which finds chemical names within a web document and shows the corresponding structure with a link to property predictions.
ChemAxon’s work to integrate into SharePoint will enable full cheminformatics capabilities within SharePoint Blogs, Wikis, and discussion boards. This clearly could transform the way that discussions within a team occur. In short, this was an exciting meeting that showed solid progress, fruitful collaborations, easy integration of ChemAxon software, and suggestions of exciting capabilities in the future.





Lead-in · return to TOC
The 2009 North American ChemAxon User Group Meeting was held at the Catamaran Resort and Spa in San Diego, California, September 15-16. As is usual for ChemAxon, the site was wonderful, the food delicious, the social functions extraordinary, and the talks stimulating. The resort is on Mission Bay, which afforded swimming, beach volleyball, and opportunities for boating during non-session times. Preceding the meeting itself we enjoyed a beach party along Mission Bay. The banquet evening involved a bus ride to the Manchester Grand Hyatt, where we enjoyed champagne at the Top of the Hyatt bar, which offers a spectacular view of the harbor, and as the sun set, of the lights of the city. For dinner we walked a few blocks to the Strip Club, which is named for its specialty, strip steaks. Nicknamed by Alex Drijver “The IKEA of Restaurants”, it features strategically placed large gas grills on which we cooked our protein. This made for an atmosphere in which it was easy to mix with other attendees. Lastly, after the meeting closed on Wednesday, those who remained boarded yet another bus to go to a Mexican restaurant in Old Town. It featured authentic Mexican food, including delicious guacamole and various margaritas. As usual the whole meeting had a relaxed but festive feeling, which encouraged interaction between participants, whether they were from ChemAxon or not. In spite of travel restrictions and employee reductions at many companies, the attendance of non-ChemAxon participants at the meeting remained steady at 75. Of these, 14 were from major pharmaceutical companies, 13 from ten universities and research centers, and five from content providers. The remainder was made up of partners or other interested parties. Eight traveled to North America for the meeting: one each from Europe and China, and five from Japan.
The continued high level of participation at the meetings is an indication of the power and utility of the ChemAxon software, and of the interest that users and partners have both in the uses to which others are putting the software and in the new capabilities that are soon to be released or in development.

The meeting was again preceded by two workshops (neither of which I attended); one was for end-users and the other was for developers. Comments of the attendees were positive.

Of course, at meetings like this, information is traded informally at the breaks and meals. For example, did you know that the ligand search and display and the structure display on the PDB (Protein Data Bank) web page are powered by ChemAxon tools? One participant wondered why anyone would use anything other than JChem for Excel for a chemically aware Excel spreadsheet. Several participants were considering the JChem Cartridge as a replacement for or alternative to the Daylight cartridge.

There were a total of 13 presentations by ChemAxon, 13 by various users and partners, and 13 additional short presentations by partners.

The formal meeting was started by a presentation by Alex Drijver, the CEO of ChemAxon. He reported that the staff has grown substantially. Developers remain the majority of staff. The company has unusually low turnover of employees, attributed by some to the beauty of Hungarian women (recall the Gabor sisters) – these days they will not move to another country. Income for 2008 was up significantly and 2009 is also expected to show an increase.

Alex reminded us that aside from exceptional customer service, an important strength of the company is its focus on components and toolkits. This allows the customer to decide what is needed and how to deploy the capabilities to support their unique environment and needs. A further strength of the company is that 35 partners incorporate ChemAxon software into their product or provide customization for users. Thus, any investment in ChemAxon software can be leveraged when new needs arise. He also emphasized that with 40 developers, development is on-going. This year the new capabilities are JChem for Excel, JChem Web Services, and Metabolizer. Although the traditional language for ChemAxon software is Java, applications are now also written for JChem for Excel, .NET, OLE, and SharePoint. A web services API and a .NET API are also available. Alex reminded us that special niches of ChemAxon software are Markush and LibMCS. The company is also investigating the possible deployment of SaaS (Software as a Service) and Microsoft SharePoint.

I was next on the schedule with a talk entitled “Tautomers: A Rant”. In my opinion, tautomerization presents a challenge for all aspects of computer-aided molecular design. Not only do subtle changes in molecular structure affect the ratio of tautomers, but this ratio also depends on the solvent or binding site environment. On a practical level, ignoring the possibility of tautomerization provides erroneous estimates of molecular similarity, which in turn affects the reliability of clustering or diversity selection of molecules. Physical property measurements such as octanol-water log P or pKa may not identify the relevant tautomers or tautomer ratio, which in turn affects the reliability of algorithms to calculate these properties. Most challenging is to account for tautomerization when calculating ligand-macromolecule binding. The informatics challenge is how to build a database system such that the relationships between tautomers are kept and that tautomer ratios can be easily recorded and searched. The computational challenge is to develop software that predicts the possibility of a ring-chain tautomerization in a molecule.

Pat Walters from Vertex described their ASAP system that emphasizes multidimensional drug discovery. It provides an intuitive overview of all compound data, for example with heat maps, while also enabling seamless viewing of the detailed data. Biological properties are grouped as affinity, selectivity, activity, properties (solubility or hERG, etc), or PK. An interesting feature is the ability to select and always display the properties of the best or reference compound for a particular project. To elucidate structure-activity relationships they provide a tool to add a column that is colored if a particular substructure is present in the molecule and also an R-group table derived from a user-entered core.

Brett Hiemenz from GSK reported on their efforts in Service Oriented Architecture, SOA. Because of mergers, the company has applications in many different environments and needed to unify them so that scientists can easily access all relevant information. The current system has many web services from multiple vendors. They serve a variety of clients including the Inforsense workflow, ACD and BioByte calculations, as well as Symyx, Daylight and ChemAxon chemical cartridges. Currently each day the application serves 8,000 chemical structure lookups, 11,000 structure format transformations, 10,500 simple property calculations, and 1,300 inventory transactions. The benefits of the architecture are that it is both backward compatible and upgradeable.

Marvin · return to TOC
Akos Papp from ChemAxon described and demonstrated many of the new features in Marvin. Users will appreciate the new Template library. With transparent structure painting, structures can be shown as an OLE object, or as EMF, SVG, or PNG background images. There are also improvements in the OLE object for annotating chemical structures. In Marvin one can now put charges on the graphical resizable brackets that enclose several atoms. The periodic table window is colored by various criteria. The new release shows considerable speedup in the MarvinView spreadsheet with large files, and in the applet loading. With Reaxys Generics in MarvinSketch 5.2.1 one can specify generic atoms and groups. (MarvinSketch is now the default editor of the new Reaxys system that replaces Beilstein Crossfire.) Sprouting groups is a feature in development – the sprouting can generate a connection bond, form a spiro ring, or fuse two rings. Version 5.3 will support SKC import/export and CDX export. It is also planned to incorporate a structure checker, chemical terms editor, projected drawings, and multi-record file import into MarvinSketch 5.3. Lastly, there is now a .NET version of the MarvinSketch and MarvinView GUIs.

Hongzhou Zhang from Eli Lilly and Company described their SPrime (System for Predictive In-silico Method), a computational framework for drug discovery. It consolidates various in silico modeling and cheminformatics tools in a desktop application for bench scientists. It uses a unified data structure representation and a standardized plug-in to add new features. Core components of the system are model/compound set trees, multiple types of models/tools and compound sets, compound set manipulation, set annotation capability, and charting functionality linked to data grid. SPrime involves a hybrid environment with multiple languages, commercial and open source components. They used JNBridge technology to embed Marvin inside Winform. The Marvin plug-in supplies log P, log D, and pKa models. SPrime includes approximately 500 registered models, displayed in tree form, that can be private, available to only one group, or public. SPrime includes tools for structure fragmentation, reaction based enumeration, scaffold based clustering, property explorer, principal components analysis of chemical space, pharmacophore model fitting, docking, and 3D overlay.

Tianhong Zhang from Pfizer discussed using ChemAxon components (MarvinSketch, MarvinView, Calculator Plugins and Marvin API) in their system for macromolecular structure notation, editing, and registration. The problem is that a typical macromolecule structure representation relies on a library of possible repeating units that are often represented by a single letter. However, modified biological polymers may incorporate an additional substructure or may link two types of polymer together. Traditional polymer notation systems do not describe the details of the connections between disparate groups, but full atom representations as are used for small molecules are too unwieldy for most uses. Hence Pfizer decided to design a polymer notation language that can describe the structure in unambiguous but simple terms and yet be expanded to full atom/bond representation when needed. The language was developed using ChemAxon extension of SMILES where R-groups are used to indicate attachment points. The corresponding macromolecule editor incorporates structure drawing and visualization, the monomer manager, and the notation toolkit, which also supports property calculations. Registration is handled in a parallel fashion.

JChem · return to TOC
Tim Dudgeon of ChemAxon reported on what’s new and what’s coming in Instant JChem. Version 2.5.4 was just released and is the end of the 2.5 series. Additionally, Version 3.0, a new feature release, is also available. The new version includes reaction-based enumeration. An interesting new feature is the addition of URL fields that can link out to external content, data or image. The URL can return static or dynamic information that is generated from a field in Instant JChem. The results from the URL can be in form or grid view. Link out to applications on the web, such as Google or ChemSpider, is also supported. The new features include a structure renderer that uses text fields such as chemical terms expressions to display tautomers or stereoisomers or Bemis-Murcko scaffolds. In addition, Version 3.0 supports printing several forms on one page. The improved schema editor makes it easier to connect to a pre-formed database, and the data tree and schema editor are combined. One can now control the ability to edit specific columns or tables. Also new is the ability to copy forms from one data tree to another. Version 3.1, to be released later this year, will provide an IJC server. This release is the main focus of the development team.

John McNeil of John McNeil & Co reported on cost-effective cheminformatics for small chemistry teams integrated within larger discovery groups. The issue at Ambrx was how to register and track the structures, batches, and analytical data for the non-native amino acids that they incorporate into proteins. The company uses LabSynch as an electronic laboratory notebook, process management system, collaboration web service, tracking system, and data capture and analysis tool. However, the LabSynch database is not chemistry aware and the three chemists at Ambrx needed a registration and batch tracking system that also supports substructure searching. The solution was to design a structure and batch registration system based on JChem running on Tomcat. Adding structure items to LabSynch was straightforward using JChem Base and the LabSynch API. The whole system was very cost-effective.

Szabolcs Csepregi from ChemAxon described the new features of the JChem back-end, both JChem Base and JChem Cartridge, which is a wrapper around JChem Base. It now supports many chemical file formats including IUPAC and traditional names, InChI, CDX, etc. and will soon support the coming SKC structure input. New is support for a .NET API and JChem Web Services. Version 5.1.X supports variable positions in search queries. In Version 5.2.X, the current version, there is polymer storage and search including ladder and copolymers; support for searching of attached data and repeating units; and Dice, Euclidean and Tversky similarity metrics. Upcoming searching options include homology (generic) groups such as alkyl and aryl. JChem Base will also see an integration of the R-group decomposition capability. In the cartridge, one will be able to do similarity searches on user-defined fingerprints.
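For readers unfamiliar with the similarity metrics named above, the following is a minimal sketch of the textbook definitions applied to binary fingerprints represented as sets of "on" bit positions. It is not ChemAxon code; the function names and the toy fingerprints are made up for illustration, and the exact variants implemented in JChem may differ.

```python
from math import sqrt


def dice(a: set, b: set) -> float:
    """Dice coefficient: 2|A∩B| / (|A| + |B|)."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def tversky(a: set, b: set, alpha: float = 0.5, beta: float = 0.5) -> float:
    """Tversky index: c / (c + alpha*|A-B| + beta*|B-A|), c = |A∩B|.

    alpha = beta = 1 gives Tanimoto; alpha = beta = 0.5 gives Dice.
    """
    common = len(a & b)
    denom = common + alpha * len(a - b) + beta * len(b - a)
    return 1.0 if denom == 0 else common / denom


def euclidean_distance(a: set, b: set) -> float:
    """For binary vectors this is sqrt(number of bits that differ)."""
    return sqrt(len(a ^ b))


# Hypothetical fingerprints: positions of the bits that are set.
query = {1, 4, 7, 9}
target = {1, 4, 8}

print(dice(query, target))                # 2*2 / (4+3)
print(tversky(query, target, 1.0, 1.0))   # Tanimoto special case
print(euclidean_distance(query, target))  # sqrt(3)
```

The asymmetric weights in the Tversky index are what make it useful for substructure-like searches: setting alpha high and beta low penalizes query bits missing from the target more than extra bits in the target.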

Yingyao Zhou from the Genomics Institute of the Novartis Research Foundation highlighted the issue of educating users so that they achieve the best results from structure searches. Hence, they developed an educational tool for their Lead Discovery Database, LDDB. LDDB supports compound management, HTS, analytical and medicinal chemistry, pharmacology, and program management. It contains 3 million compounds, 250 million results of HTS, and data on 180,000 compounds in lead optimization. It uses many of the ChemAxon tools. The tutorial is now integrated into LDDB with a “Test Query Structure” link. The tutorial also includes pre-designed search examples to help users design their own queries.

Julio Carneiro from SkunkWerks Software discussed using Adobe Flex to add JChem Web Services to their original screening library system – this now provides substructure searching over inventory, HTS data, and the chemical registry via the web. They converted the client from ChemOffice to ChemAxon because of the excellent technical support and open architecture. The application for the Centre for Drug Research and Development (CDRD) in British Columbia uses a CDRD database server, a CDRD web server, JChem Web Services on Tomcat, and JChem Base. The final note was that ChemAxon provided excellent technical support that allowed SkunkWerks’ non-chemist programmers to incorporate chemistry easily into an HTS application.

Craig Knox of the University of Alberta reported on how they built a central structure hub for DrugBank (6568 structures), the Human Metabolome Database (7982 structures), FooDB (2000 structures of food components), and T3DB (3000 structures of toxins). The problem was that the original process to update a structure database was cumbersome and updates were not immediately available. To update the system they integrated JChem Base with Ruby on Rails, using JRuby to bridge the Java and Ruby worlds. The ability to use Chemical Terms is an attractive feature of JChem. Ruby on Rails is an open source web application framework for the Ruby object-oriented programming language. Ruby on Rails provides web developers support for rapid development. JRuby implements Ruby atop the Java Virtual Machine. As such it runs both Ruby on Rails and Java code, JChem for example. With this architecture it is possible to create, read, update and delete structures with an extremely simple set of actions. The talk finished with example screen shots of several of the databases. The system is now operational in all four databases.

JChem for Excel · return to TOC
Tamas Pelcz of ChemAxon presented JChem for Excel, which was released last year. Since then, changes in the way that structures are handled have led to major improvements in performance; structures now move and resize with cells, and there is an undo for structure delete and structure edit. New in 1.1.2 is the inclusion of Marvin .NET; the ability to convert SMILES or IUPAC names to structures; and the ability to copy-and-paste to and from Marvin, ChemDraw, ISISDraw, other OLE Office applications, as well as workbook-to-workbook. Rendering can be by rows or a grid; 2D or 3D; with labels for atom numbers or bond lengths or derived from Calculator Plug-ins (for example partial atomic charges, pKas of specific atoms, atomic contribution to log P). As well as a 2D clean function, JChem for Excel also includes Standardizer, which can be customized. It has a function to show the Bemis-Murcko structure framework as well as the original structure. Reactor functions produce a matrix arrangement of multiple products. Reactivity and selectivity rules are available. Importing from JChem Cartridge or JChem Base databases now supports MySQL. Exporting to files now allows the user to select the structure column to export. This product is in active development with several enhancements planned.

JChem beyond Java · return to TOC
Jonathan Lee of ChemAxon presented the interfaces of the JChem Suite outside of Java, i.e. using web services, .NET, and SQL. The choice of the tool depends on the local environment. A pure .NET solution is available for all non-GUI elements; Marvin .NET GUI components will be available in Marvin 5.3. This is simpler and faster than the earlier third-party JNBridge solution. The .NET interface supports all of the JChem Suite except Cartridge and Marvin Beans classes.

JChem Web Services conform to WS-I, SOAP, and WSDL standards. So far they support JChem Base searching, Standardization, Chemical Terms evaluation, and molecule conversion. Future enhancements include Reactor, SQL execution, relational table searching, and batch processing. They support many languages and run on Windows, Unix, Linux, and Mac OS X.

Tamas Pelcz from ChemAxon described their efforts with SharePoint. ChemAxon’s extensions enable chemistry (render, add, edit, delete, filter) within SharePoint lists, Blogs, Wikis, and discussion boards. These extensions use both applets and the .NET based API. ChemAxon plans to incorporate structure-based predictions, importing and exporting files, charting, Standardizer, structure checker, Reactor, Fragmenter, database searching, and searching for structures wherever they are within a document. This is clearly an on-going effort in an exciting new technology.

Command-Line Tools and Pipelining · return to TOC
Gyorgy Pirok of ChemAxon reviewed a set of the command-line tools that are available for all ChemAxon products. They are especially useful when included in scripts for batch processing. Mview starts MarvinView by opening the specified file. The view can be a grid or consist of only a part of a large file. Molconvert, as its name suggests, converts a file from one structure format to another. Convert2image generates a jpeg image from a structure file. Cxcalc runs the specified batch calculations on the input file. Typical options include calculating physical properties, enumerating a Markush structure, calculating the lowest energy conformer, or determining the IUPAC names. Evaluate is a command line interface for complex calculations using the Chemical Terms language, which contains more than a hundred chemical functions. Jcsearch provides simple and complex structure search functions for both files and databases. Standardize converts the input structures according to rules specified by the user. For example, counterions can be removed, nitro groups standardized, alias atoms converted to abbreviated groups, etc. React generates reaction products from reactions, either SMARTS or rxn files or specified on the command line.
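To illustrate the batch-scripting use mentioned above, here is a small sketch of a wrapper that builds command lines for two of the tools and runs them only if they are installed. The argument order shown (`molconvert <format> <input> -o <output>` and `cxcalc <calculation> <input>`) follows my recollection of the documented usage and should be checked against the installed version; the file names and helper functions are hypothetical.

```python
import shutil
import subprocess


def molconvert_cmd(out_format: str, infile: str, outfile: str) -> list:
    """Command line to convert infile to out_format (assumed argument order)."""
    return ["molconvert", out_format, infile, "-o", outfile]


def cxcalc_cmd(calculation: str, infile: str) -> list:
    """Command line to run one batch calculation, e.g. 'logp' (assumed order)."""
    return ["cxcalc", calculation, infile]


def run_if_available(cmd: list):
    """Run cmd and return its stdout, or None if the tool is not on PATH."""
    if shutil.which(cmd[0]) is None:
        return None
    done = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return done.stdout


# Hypothetical two-step batch: convert an SD file to SMILES, then compute log P.
pipeline = [
    molconvert_cmd("smiles", "library.sdf", "library.smi"),
    cxcalc_cmd("logp", "library.smi"),
]
for cmd in pipeline:
    print(" ".join(cmd))  # show the command; call run_if_available(cmd) to execute
```

Keeping the command construction separate from execution makes the script easy to dry-run on a machine where the tools are not installed.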

Lastly, in his talk Jonathan Lee mentioned that ChemAxon components exist for several workflow software vendors, including Accelrys Pipeline Pilot, Infocom JChem Extensions for KNIME, and Inforsense Analytics. Szilard Dorant from ChemAxon described the components for Pipeline Pilot. The following components are available: Reactor, JChem Base tables, Chemical Terms, LibMCS clustering, Molecule to IUPAC and vice versa, MolConverter, Tautomerization, Markush Enumeration, and the recently added ChemAxon 3D Conformers. Markush Enumeration, MolConverter, Tautomerization and 3D Conformers were added this year. The components have the full functionality of the original, although some (understandably) require an expert user. Support for JChem Cartridge for Oracle, improvements to the Molecular Table Viewer, and integration with Instant JChem are the plans for the near future. As shown in Isabella Haight’s talk, using the components in Pipeline Pilot is a convenient way for a user to integrate ChemAxon tools into their research.

Reactor and Metabolizer · return to TOC
Gyorgy Pirok of ChemAxon presented new capabilities built on Reactor. The Reactor engine is now provided as a module in Instant JChem, thus providing virtual synthesis capability within Instant JChem. This includes multi-molecular reactions with sequential and combinatorial reactant combinations. Output tables can contain products or specific reactions. There are 145 reactions in the library; they are mostly named reactions. A major effort is underway to add several hundred classic preparative reactions for the next release. The plan is to mine reaction databases for this information.

Gyorgy then described Metabolizer, which grew out of the KnowTox project, a joint effort of Aureus Pharma, Sanofi Aventis, ChemAxon and the Department of Chemical Information Technology at Budapest University of Technology and Economics. The project joins information on hepatotoxicity with predictions of metabolic stability to predict hepatotoxicity. ChemAxon has continued the development of Metabolizer by building a library of more than two hundred human Phase I CYP450 biotransformations with an indication of whether the reaction is fast or slow. The generic transformations are supplemented with Chemical Terms that further define the scope and limitations of the transformation. The transformations are currently being evaluated against the literature. The biotransformation library contains example transformations. To predict metabolic stability and the major metabolites, each biotransformation is categorized with a base speed.

Isabella Haight from Abbott Laboratories described her use of Reactor in Pipeline Pilot to produce virtual libraries of synthesizable compounds from a well-defined reaction scheme. Simply enumerating all possible products from a given list of starting materials does not consider whether the given reagents would indeed react. The power of the transformations in Reactor lies in the reaction rules for reactivity, selectivity, and exclusion that further refine the scope of the reaction. Isabella illustrated the process with the construction of a CombiChem library that first uses the Bischler-Mohlau indole synthesis and then attempts to perform a Suzuki coupling on the products. For example, with the reaction rules switched on, the virtual indole synthesis produced 51 products from 10 ketones and 20 amines, whereas without the rules 310 products were produced. She also showed that if the reagents in the Maybridge building block catalogue of 6,458 compounds are pre-filtered so as to contain only the required type of substrate, the virtual indole synthesis took ten minutes, but if Reactor is run on the whole catalogue the run took seven hours. Both produced 23,674 products. The advantage of using Reactor as a component in Pipeline Pilot is that one can construct a protocol that filters the reagents, generates the products, applies typical filtering rules such as molecular weight or CLOGP cut-offs, and then clusters the compounds. The compounds are stored in a database with the reaction SMILES, which shows not only the product but also the starting materials.
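The effect of reaction rules and of pre-filtering can be sketched with a toy enumeration. This is only an illustration of the idea, not Reactor itself: the reagent records, their flags, and the `reactive` rule are invented stand-ins for real structures and real reactivity rules.

```python
from itertools import product

# Hypothetical reagent records; a real workflow would hold structures instead.
ketones = [
    {"name": "k1", "alpha_halo": True},
    {"name": "k2", "alpha_halo": False},
    {"name": "k3", "alpha_halo": True},
]
amines = [
    {"name": "a1", "aromatic": True, "free_ortho": True},
    {"name": "a2", "aromatic": False, "free_ortho": False},
    {"name": "a3", "aromatic": True, "free_ortho": False},
    {"name": "a4", "aromatic": True, "free_ortho": True},
]


def reactive(ketone, amine):
    """Toy reactivity rule: both partners must carry the required features."""
    return ketone["alpha_halo"] and amine["aromatic"] and amine["free_ortho"]


def enumerate_pairs(ketones, amines, rule=None):
    """Enumerate reagent pairs, optionally keeping only pairs the rule allows."""
    pairs = product(ketones, amines)
    if rule is not None:
        pairs = (p for p in pairs if rule(*p))
    return [(k["name"], a["name"]) for k, a in pairs]


all_pairs = enumerate_pairs(ketones, amines)       # every combination
ruled = enumerate_pairs(ketones, amines, reactive)  # only plausible ones

# Pre-filtering the reagent lists gives the same result from fewer rule checks.
good_ketones = [k for k in ketones if k["alpha_halo"]]
good_amines = [a for a in amines if a["aromatic"] and a["free_ortho"]]
prefiltered = enumerate_pairs(good_ketones, good_amines, reactive)

print(len(all_pairs), len(ruled), len(prefiltered))
```

In this toy case the rule is checked on 12 pairs without pre-filtering and on 4 with it, which mirrors why filtering the catalogue first shortened the run so dramatically while yielding the same products.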

Tobias Kind from the UC Davis Genome Center described cheminformatics approaches for metabolomics research. In metabolomics, the problem is to identify the compounds that appear as peaks in liquid chromatography-mass spectrometry (LC-MS) or gas chromatography-MS traces. They used ChemAxon tools including Standardizer and Reactor to generate molecular structures and curate mass spectral databases. In addition they have developed a model to predict retention time from molecular properties. By this combined approach they are able to measure more than 150 metabolites in a few micrograms of biological material. Marvin physical property calculations are also used to support structure elucidation from the MS traces.

Clustering · return to TOC
Miklos Vargyas of ChemAxon introduced the new JKlustor Suite. It has a new core infrastructure that provides a generic basis for various clustering approaches, a comprehensive API, faster operation, and better support for large data sets (over a million compounds). In development is integration with the JChem Oracle Cartridge, Instant JChem, and JChem Base, as well as enhancements to the Pipeline Pilot components. Future versions will feature a common GUI based on LibMCS that incorporates all methods. Future developments will include disconnected maximum common substructures, k-means clustering, the possibility for a molecule to be in more than one class, and the ability to cluster 10 million compounds.

Oleg Ursu from the University of New Mexico presented their clustering method. The objective is to cluster a database and recognize the Maximum Overlapping Set (MOS) of atoms of the molecules in each cluster. The problem is that MCES (Maximum Common Edge Substructure) clustering using the RASCAL algorithm is computationally too expensive to be used for a large database. To solve this problem, they first cluster using the Taylor algorithm on fingerprints using a high similarity level, which produces many singletons. Then the MCES for each cluster is found and singletons are assigned to one of the existing clusters if possible and matching MCES clusters are merged. Finally, the process is repeated using a lower similarity value. For even larger speed increases, they use an alternative clique detection algorithm for the MCES published by Grosso. The program was tested on release 2009.1 of the WOMBAT database, which contains 242,485 unique structures. Clustering took one week and produced 20,419 clusters and 74,100 singletons. One advantage of the MOS is that it contains the common parts of molecules within a cluster whether these parts are connected or not.
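The fingerprint stage of such a scheme can be sketched as follows. This is a simplified illustration under stated assumptions: it implements a basic Taylor-Butina style clustering with Tanimoto similarity on toy bit sets, and it simply re-clusters the leftover singletons at a lower threshold. The MCES calculation, the assignment of singletons to existing clusters, and the merging of clusters described in the talk are not shown, and all names and data are hypothetical.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto similarity of two fingerprints given as sets of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def taylor_butina(fps: dict, threshold: float) -> list:
    """Cluster ids in fps; each cluster is a list with its centroid first."""
    ids = list(fps)
    neighbors = {
        i: {j for j in ids if j != i and tanimoto(fps[i], fps[j]) >= threshold}
        for i in ids
    }
    unassigned = set(ids)
    clusters = []
    while unassigned:
        # Centroid = unassigned id with most unassigned neighbors (ties: lowest id).
        centroid = max(sorted(unassigned),
                       key=lambda i: len(neighbors[i] & unassigned))
        members = [centroid] + sorted(neighbors[centroid] & unassigned)
        clusters.append(members)
        unassigned -= set(members)
    return clusters


def multi_pass(fps: dict, thresholds: tuple):
    """Cluster at each threshold in turn, re-clustering only the singletons."""
    clusters, pending = [], dict(fps)
    for t in thresholds:
        result = taylor_butina(pending, t)
        clusters += [c for c in result if len(c) > 1]
        pending = {c[0]: fps[c[0]] for c in result if len(c) == 1}
    return clusters, sorted(pending)


# Toy fingerprints (sets of on-bit positions).
fps = {
    "m1": {1, 2, 3, 4}, "m2": {1, 2, 3, 5}, "m3": {1, 2, 3, 4, 5},
    "m4": {10, 11, 12}, "m5": {10, 11, 13, 14}, "m6": {20, 21},
}
clusters, singletons = multi_pass(fps, (0.7, 0.35))
print(clusters, singletons)
```

Starting with a strict threshold keeps the first clusters tight; relaxing it only for the leftovers is what lets the later pass pick up looser relationships without disturbing the clusters already formed.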

Markush Structures · return to TOC
Szabolcs Csepregi of ChemAxon described the latest development in the ChemAxon handling of Markush structures. He reminded us that Markush structures are used to describe combinatorial libraries and, also, in chemistry-related patents. In ChemAxon software, they are entered by drawing with MarvinSketch, from patent databases (coming in version 5.3), or by reagent clipping or R-group decomposition for a combinatorial library. Once entered, the Markush enumeration plug-in supports full, selected, or random enumeration as well as calculation of library size. Markush structures are stored as such (no enumeration) in JChem Base and Instant JChem. Substructure and full structure searches are supported. The homology (generic) groups in Markush structures may now be described by one or more of 19 built-in groups or by user-defined groups. These are available as templates in Marvin to enable easier sketching. In addition, variable groups may be defined from a list of reagents. The current version also supports position variation and variable numbers of repeating units. The ability to import .VMN (Derwent World Patent Index) files and use all features of this format is under development. Methods to analyze the overlap of Markush structures and to find the maximum common substructure of a Markush structure and a specific molecule are also planned.
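The difference between library-size calculation, full enumeration, and random enumeration can be shown with a toy Markush-like template. This sketch uses plain string substitution on a made-up scaffold and R-group lists; it is not the Markush enumeration plug-in and ignores homology groups, position variation, and repeating units.

```python
import random
from itertools import product
from math import prod

# Hypothetical scaffold with two R-group slots, written as a SMILES template.
scaffold = "{R1}c1ccc({R2})cc1"
r_groups = {"R1": ["C", "CC", "O"], "R2": ["F", "Cl"]}


def library_size(r_groups: dict) -> int:
    """Size of the full library: product of the R-group list lengths."""
    return prod(len(options) for options in r_groups.values())


def enumerate_full(scaffold: str, r_groups: dict):
    """Yield every member of the library."""
    names = list(r_groups)
    for combo in product(*(r_groups[n] for n in names)):
        yield scaffold.format(**dict(zip(names, combo)))


def enumerate_random(scaffold: str, r_groups: dict, count: int, seed: int = 0):
    """Draw count members at random (with replacement, so repeats can occur)."""
    rng = random.Random(seed)
    return [
        scaffold.format(**{n: rng.choice(opts) for n, opts in r_groups.items()})
        for _ in range(count)
    ]


full = list(enumerate_full(scaffold, r_groups))
sample = enumerate_random(scaffold, r_groups, 3)
print(library_size(r_groups), full, sample)
```

Computing the size from the list lengths alone is why a library count is available even when full enumeration would be impractically large.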

Name-to-structure and Structure-to-name · return to TOC
Daniel Bonniot de Ruisselet from ChemAxon presented their work on name to structure and structure to name as well as chemicalize.org. The objective of this work is to generate a chemical structure from a name in a document or web page. The structure to IUPAC name conversion is already available as a plug-in in MarvinView, MarvinSketch and Instant JChem. Recent changes result in support for more names of fused structures; better support for ions, IUPAC numbering and priorities; and removal of over-specific E/Z labels. Name to structure is the focus of much of the current effort. This tool is tested, among other methods, by converting a structure to a name, and then converting the resulting name to a structure. If the program is operating correctly, the original and the final structures should be identical. In version 5.1.0, 90% of the names of molecules in the NCI were imported, but only 69% of the test names were identical. This increased in version 5.2.5 to 98% and 96% respectively, with a 33% increase in speed. So far they have implemented recognizing names in text, HTML, and XML, with PDF and .doc files planned. New is the capability to recognize errors in optical character recognition. This will also be an option in the name to structure conversions.
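The round-trip test described above can be expressed as a small harness. This is a sketch only: the two converters are passed in as functions, and here they are toy dictionary lookups standing in for real structure-to-name and name-to-structure tools. It also assumes both percentages are taken relative to the total number of test structures, which the talk summary does not state, and it compares structures as plain strings where a real test would compare canonical forms.

```python
def round_trip_stats(structures, to_name, to_structure):
    """Return (fraction imported, fraction identical) for a round-trip test.

    to_name(structure) -> name; to_structure(name) -> structure or None
    when the name cannot be parsed.
    """
    imported = identical = 0
    for original in structures:
        recovered = to_structure(to_name(original))
        if recovered is None:
            continue  # the name could not be converted back at all
        imported += 1
        if recovered == original:
            identical += 1
    total = len(structures)
    return imported / total, identical / total


# Toy stand-ins for the real converters (hypothetical data).
names = {
    "CCO": "ethanol",
    "CC(=O)O": "acetic acid",
    "c1ccccc1": "benzene",
    "CC(C)O": "propan-2-ol",
}
parser = {
    "ethanol": "CCO",
    "acetic acid": "CC(=O)O",
    "propan-2-ol": "CCCO",  # deliberately wrong, to show a non-identical import
    # "benzene" is missing, to show a name that fails to import
}

stats = round_trip_stats(list(names), names.get, parser.get)
print(stats)
```

Reporting the two fractions separately distinguishes names the parser rejects outright from names it accepts but interprets incorrectly, which is the distinction between the 90% and 69% figures above.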

Chemicalize.org adds structural information to existing web pages. There is a popup window with the structure image, a link to property predictions, and a searchable structure-to-web page index. It is in the proof-of-concept stage.

Nicholas Goncharoff from SureChem continued the name-to-structure theme, this time focusing on full-text patent documents. Typically only about 60% of the names in patents can be converted. For example, in 900 representative patents there are 101,000 names; 59,582 of these can be converted by at least one of the four available tools, and 6,128 names are converted by only one tool. (One source of this low conversion rate is optical character recognition, which often produces typographical errors in the name.) They found that adding the ChemAxon tool, which he said is much enhanced since the June release, increased the conversion rate by 17%.
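
As a quick check on the figures quoted above, the short Python calculation below derives the overall conversion rate and the share of converted names that depend on a single tool. Only the three counts come from the talk; the variable names are illustrative.

```python
# Counts quoted in the talk for 900 representative patents.
total_names = 101_000
converted_by_any = 59_582       # converted by at least one of the four tools
converted_by_one_only = 6_128   # converted by exactly one tool

# Fraction of all names that at least one tool converts.
overall_rate = converted_by_any / total_names

# Fraction of the converted names that would be lost if that single tool were absent.
single_tool_share = converted_by_one_only / converted_by_any

print(f"overall conversion rate: {overall_rate:.1%}")       # 59.0%
print(f"converted by one tool only: {single_tool_share:.1%}")  # 10.3%
```

The first figure is the "approximately 60%" cited; the second suggests that roughly one converted name in ten depends on a single tool, which is why adding another converter can raise the rate noticeably.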

ChemAxon in 3D · return to TOC
Miklos Vargyas of ChemAxon presented the capabilities of ChemAxon in 3D. The current capabilities include the 3D structure and conformer generators and MarvinSpace, the 3D molecular structure and surface visualizer for small molecules and macromolecules. The Conformers plug-in supports geometry optimization and energy calculation using the Dreiding force field. The Geometry plug-in calculates steric hindrance, molecular surface area and volume, and minimal projected area to aid permeability prediction. Molecular dynamics is available as a plug-in. In development are improvements to molecular dynamics to generate extremely fast conformational transitions by eliminating the vibrational components of dynamics. The 3D structure generator is now one-tenth as fast as Corina for complex structures; however, the structures from ChemAxon do not require optimization. ChemAxon also provides 3D similarity by volume alignment, with the ability to favor aligning atoms of similar properties. These capabilities are very similar to the OpenEye product ROCS, with the exception that in the current release of the ChemAxon tools one can include conformational flexibility of one or both molecules in the alignment, and more than two molecules can be aligned. The alignments produced by the program compare favorably with those of other superposition algorithms.

A 3D pharmacophore fingerprint is under development. It will include the minimum and maximum distances for each pair of atoms, similar to the fingerprints used by Unity. Virtual screening with it appears to be as fast as 2D similarity searching. It will include Tanimoto, Tversky, Dice, and Euclidean similarity scores. It remains to be seen whether the ChemAxon 3D virtual screening software is competitive with that of other vendors.
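
For readers unfamiliar with the four scores named above, the following Python sketch gives their standard textbook definitions for binary fingerprints, represented here as sets of "on" bit positions. It is an assumption that this simplification fits: the talk did not say how the ChemAxon fingerprint is encoded or scored, the example bit sets are made up, and the Euclidean measure is shown as a distance (smaller means more similar) rather than a similarity.

```python
import math


def tanimoto(a, b):
    """Shared bits divided by bits set in either fingerprint."""
    c = len(a & b)
    union = len(a) + len(b) - c
    return c / union if union else 1.0


def dice(a, b):
    """Twice the shared bits divided by the total bits set in both fingerprints."""
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 1.0


def tversky(a, b, alpha=0.5, beta=0.5):
    """Asymmetric score; alpha weights bits unique to a, beta weights bits unique to b."""
    c = len(a & b)
    denom = c + alpha * len(a - b) + beta * len(b - a)
    return c / denom if denom else 1.0


def euclidean_distance(a, b):
    """For binary vectors, the square root of the number of differing bits."""
    return math.sqrt(len(a ^ b))


# Made-up example fingerprints (sets of bit indices).
query = {1, 4, 7, 9}
target = {1, 4, 8, 9, 12}

print(tanimoto(query, target))
print(dice(query, target))
print(tversky(query, target, alpha=1.0, beta=0.0))
print(euclidean_distance(query, target))
```

With alpha = beta = 1 the Tversky score reduces to Tanimoto, and with alpha = beta = 0.5 it reduces to Dice; unequal weights are what make it useful for substructure-like "is the query contained in the target" comparisons.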

Partner Presentations · return to TOC
Nora Lapusnyik and Alex Drijver introduced the partner presentation portion of the meeting by reminding us that ChemAxon supports freedom of choice in platform and language as well as in which capabilities to purchase. Further advantages of ChemAxon are the rapid response to support questions, the reliability of the software, and the many partners who can customize the software.

Ton van Daelen from Accelrys reminded the group about the ChemAxon components in Pipeline Pilot and indicated their expectation of components that will provide additional capabilities.

Michael Burke from Agilent Technologies Inc discussed the integration of ChemAxon’s JChem into the Kalabie electronic notebook.

Chip Allee from CeuticalSoft described their OpenHTS™ software, based on Excel with links to Access and Oracle. It handles most of the data management and analysis needs of high throughput screening. It uses Instant JChem, JChem for Excel, and JChem Base/JChem Cartridge for chemical intelligence.

Barry Prom from Collaborative Drug Discovery (CDD) described how the company provides the infrastructure for scientists to collaborate. It uses Marvin, Calculator Plugins, and the JChem Cartridge. CDD uses cloud-based data storage, named CDD Vault.

Yvonne Shimsock described DeltaSoft’s commercial software applications based on ChemCart, a configurable forms interface to research data in Oracle. They provide components for all major software vendors’ software and have ChemCart Applications for registration, inventory, electronic laboratory notebook, bioassay, structure activity browser, and custom synthesis tracker.

Steven Muskal from Eidogen-Sertanty presented their Kinase Knowledgebase (KKB), which contains 150,000 molecules with activity data, 500,000 SAR data points, and 400 kinase targets. The data will be available in Instant JChem.

Manisha Murthy from Elsevier presented their new product Reaxys. It integrates the data from the Beilstein, Gmelin, and Elsevier Patent Chemistry databases. It supports synthesis design, property look-up, URL link-in (from Excel, for example), and OpenURL. It uses ChemAxon MarvinView for structure display and MarvinSketch as the query editor. It was not clear from the presentation whether this will replace the individual databases or just make access to all three easier.

Sufang Zhao represented Founder Software, the leading IT solutions provider in China. The company has now partnered with ChemAxon to bring ChemAxon’s software to the Chinese biotech and pharmaceutical industries. Additionally, it is looking for customers in North America.

Takahiro Ohshima from Infocom discussed JChem Extensions for KNIME. Infocom is a Japanese company that offers IT services and solutions. KNIME is the Konstanz Information Miner, a free workflow and data mining platform. The JChem extensions support a dynamic data link, which in turn supports structure search, Standardizer, Calculation plug-in, Markush, and Reaction capabilities within KNIME. Infocom markets a KNIME server suite so that applications are run in parallel on a cluster.

Robert Feinstein from Kelaroo described their “vendor neutral” applications for reagent management and compound registration. They also do custom applications. Although they support all chemical editors, he questioned why anyone would want to use anything other than Marvin!

Derek Hayes from KineMatik discussed their eNovator™ software. It integrates an electronic laboratory notebook with a knowledge and project management system. Since 2005 they have used ChemAxon’s JChem Cartridge for Oracle and Marvin toolkits within its Electronic Laboratory Notebook and Sample Tracking components.

Jeffrey Nauss from Linguamatics indicated that they integrated ChemAxon to provide chemistry-enabled text mining. The software discovers indirect associations across multiple documents. The program supports structure searching and results visualization as well as structure-to-name using ChemAxon tools.

James Baxendale from Synaptic Science described Seurat (Structure Exploration Utility for RAtional Therapeutics). Seurat integrates diverse databases and tools and supports knowledge capture, access, reporting and collaboration.



Summary · return to TOC
The meeting highlighted the extraordinary talent and hard work of the ChemAxon staff in making their software so powerful and yet so easily integrated into many different applications.

In addition, it provided hints of fundamental changes that will come to end-users of cheminformatics tools. For example, with ChemAxon tools embedded in workflow tools, it is much easier for computational chemists and expert users to chain together a sequence of tasks for a given purpose. Moreover, such a workflow can be encapsulated and deployed to bench chemists.

Of course the web has changed the way that many things are done. ChemAxon has not ignored this trend and has increased its presence in this area. The URL fields data type in Instant JChem supports accessing web resources within a desktop application, returning text or an image. Web browsing of the scientific literature could be transformed by functions such as those seen in the prototype Chemicalize.org, which finds chemical names within a web document and shows the corresponding structure with a link to property predictions.

ChemAxon’s work to integrate into SharePoint will enable full cheminformatics capabilities within SharePoint blogs, wikis, and discussion boards. This clearly could transform the way that discussions within a team occur.

In short, this was an exciting meeting that showed solid progress, fruitful collaborations, easy integration of ChemAxon software, and suggestions of exciting capabilities in the future.
