Similarity implicated exploration of the fragment galaxy
Measured by population, fragment space is a marginal subset of the druggable universe of chemical entities. Nevertheless, it spans an astonishingly high number of structures: GDB-13, the largest publicly available virtually enumerated collection, contains nearly 1 billion fragment-sized compounds. Chemically intelligent navigation in such a vast dataset demands special-purpose solutions. Our study focuses on bringing very large chemical datasets to life with an ultra-fast similarity search method. As a use case, we search GDB-13 to find structures that are similar to FDA-approved drugs but not exemplified in the space of patented structures available within SureChEMBL. This framework represents a scaffold-hopping approach that exploits GDB-13 under the hood and demonstrates the benefit of the MadFast SimilaritySearch technique.
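To make the idea of a fingerprint-based similarity search concrete, the following is a minimal sketch and not the MadFast SimilaritySearch implementation or its API. It assumes binary fingerprints stored as Python integers and the Tanimoto coefficient as the similarity metric; the abstract states neither choice. The function names, fragment names, and toy fingerprints are hypothetical and serve only as illustration.

```python
import heapq


def tanimoto(a: int, b: int) -> float:
    """Tanimoto coefficient of two bit fingerprints stored as integers."""
    union = bin(a | b).count("1")
    if union == 0:
        # Assumption: two empty fingerprints are scored 0.0 in this sketch.
        return 0.0
    return bin(a & b).count("1") / union


def top_k_similar(query: int, library: dict, k: int = 3, threshold: float = 0.0):
    """Return up to k (score, name) pairs with score >= threshold, best first.

    This is a brute-force linear scan over the whole library.
    """
    scored = ((tanimoto(query, fp), name) for name, fp in library.items())
    return heapq.nlargest(k, (pair for pair in scored if pair[0] >= threshold))


# Hypothetical 8-bit fingerprints standing in for enumerated fragments.
library = {
    "frag_a": 0b10110010,
    "frag_b": 0b10110011,
    "frag_c": 0b01001100,
    "frag_d": 0b10100010,
}

# Hypothetical fingerprint of a query drug.
query = 0b10110010

hits = top_k_similar(query, library, k=2, threshold=0.7)
for score, name in hits:
    print(f"{name}: {score:.2f}")
```

A linear scan like this is only practical for toy inputs. At the scale of roughly a billion structures, a dedicated engine is needed, which is the motivation given in the abstract.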