Toward the Development of an Open Source Cheminformatics Platform for HTS Data Analysis and Visualization

The identification of promising chemical series from high-throughput screening (HTS) data is an integral part of a therapeutic project, one that can ultimately determine whether the project succeeds. Despite its importance in deciding how to invest resources during the discovery process, HTS data analysis is often performed in an ad hoc manner using various combinations of commercial and in-house tools. Furthermore, with the proliferation of HTS data in the public domain (e.g., PubChem), the need for a widely available, easy-to-use tool to analyze and visualize HTS data has never been greater. To this end, we describe NCGC’s ongoing effort to develop an open cheminformatics platform that addresses this need. The current platform embodies our collective experience in analyzing and visualizing HTS data across a large number of assays (>200) over the years. The JChem library, known for its maturity and robustness, has been instrumental in allowing us to develop a fairly feature-complete prototype within a short period of time. In this talk, we highlight some key features of the platform (e.g., standardization, data visualization, fragment-based automated analysis) and discuss its future directions.