Does a 3D shape-based method differ from a molecular descriptor-based similarity in its ability to predict biological activity?
Similarity of molecular structures is data-set dependent. The NCI60 panel of the Developmental Therapeutics Program (DTP) is a collection of 60 human cancer cell lines maintained by the National Cancer Institute (NCI). Since 1990, tens of thousands of chemical compounds and natural products have been screened against these cells, providing a vast repository of molecules for which both toxicity data and structural information are available. In our work, pairwise structural and biological activity similarities were calculated on a set of selected DTP agents to quantify how well structural similarity predicts related patterns in the toxicity data. Overall, we found that all tested structural metrics are comparable in their capacity to indicate biological activity similarity.
However, the set of agent pairs displaying both structural and biological activity similarity differs slightly from metric to metric. The group of compound pairs that are similar in 3D shape and biological activity but not in molecular descriptors is of special interest: scaffold hopping molecule pairs are expected among them. The toxicity patterns of these molecule pairs are analogous by definition. They also display analogous 3D shapes, as ensured by the 3D shape-based similarity, but their core structures are likely to be different because of their dissimilarity in the molecular descriptor-based metric. Hence they are likely to exhibit scaffold hopping, which is defined by similar biological activities of different molecular backbones. Scaffold hopping candidates around FDA-approved DTP agents were collected, since a scaffold hopping analogue of an active compound may exhibit better physicochemical and pharmacokinetic properties while retaining the original potency, thus providing a new direction for further optimization.
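The selection rule described above (similar 3D shape, similar activity pattern, dissimilar descriptors) can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration rather than the method used in this work: the descriptor similarity is taken to be a Tanimoto coefficient over sets of fingerprint bits, the activity similarity a Pearson correlation between per-cell-line response profiles, the 3D shape similarities are assumed to be precomputed by an external tool, and the thresholds, compound names, and data are hypothetical.

```python
from itertools import combinations
from math import sqrt


def tanimoto(a, b):
    """Tanimoto coefficient between two sets of 'on' fingerprint bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def pearson(x, y):
    """Pearson correlation between two equally long activity profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sx = sqrt(sum((xi - mx) ** 2 for xi in x))
    sy = sqrt(sum((yi - my) ** 2 for yi in y))
    return cov / (sx * sy) if sx and sy else 0.0


def scaffold_hop_candidates(compounds, shape_sim,
                            shape_min=0.8, activity_min=0.7,
                            descriptor_max=0.4):
    """Return pairs that are similar in shape and activity but not in descriptors.

    The three thresholds are hypothetical values chosen for this example.
    """
    hits = []
    for a, b in combinations(sorted(compounds), 2):
        descriptor = tanimoto(compounds[a]["bits"], compounds[b]["bits"])
        activity = pearson(compounds[a]["activity"], compounds[b]["activity"])
        shape = shape_sim[frozenset((a, b))]  # assumed precomputed elsewhere
        if (shape >= shape_min and activity >= activity_min
                and descriptor < descriptor_max):
            hits.append((a, b))
    return hits


# Toy data (hypothetical): fingerprint bits and responses on four cell lines.
compounds = {
    "A": {"bits": {1, 2, 3, 4}, "activity": [5.0, 6.0, 7.0, 8.0]},
    "B": {"bits": {4, 10, 11, 12}, "activity": [5.1, 6.2, 6.9, 8.1]},
    "C": {"bits": {1, 2, 3, 5}, "activity": [8.0, 7.0, 6.0, 5.0]},
}
shape_sim = {
    frozenset(("A", "B")): 0.90,
    frozenset(("A", "C")): 0.85,
    frozenset(("B", "C")): 0.30,
}

candidates = scaffold_hop_candidates(compounds, shape_sim)
print(candidates)
```

In this toy set only the pair A and B passes all three conditions: A and C share most of their fingerprint bits and have opposite activity profiles, while B and C differ in shape.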