Quantitative structure-activity relationships (QSARs) are broadly classified as global or local, depending on their molecular constitution. Global models use large and diverse training sets covering a wide range of chemical space. Local models focus on smaller, structurally or chemically similar subsets that are conventionally selected by human experts or, alternatively, by clustering analysis. The current study presents a comparative analysis of different clustering algorithms (expectation-maximization, K-means and hierarchical) applied to seven different descriptor sets as structural characteristics, together with two rule-based approaches, for selecting subsets on which to build local QSAR models. A total of 111 local QSAR models were developed for predicting the bioconcentration factor. Predictions from the local models were compared with the corresponding predictions from the global model. The comparison of coefficients of determination (r²) and standard deviations for local models with similar subsets from the global model shows improved prediction quality in 97% of cases. The descriptor content of the derived QSARs is discussed and analyzed. Local QSAR models were further consolidated within the framework of a consensus approach. All consensus approaches increased performance over the global and local models. The consensus approach reduced the number of strongly deviating predictions by evening out the prediction errors produced by some local QSARs.
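The local-versus-global comparison described above can be illustrated with a minimal sketch. It assumes synthetic data, a plain K-means implementation and ordinary least squares as the model; none of these are taken from the study, and all function and variable names are hypothetical. The comparison here is in-sample, so a local fit can never score below the global model on its own subset; a real study would rely on validated predictions.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's K-means; returns one cluster label per row of X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Distance of every compound to every cluster center.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def fit_linear(X, y):
    """Least-squares fit with an intercept term."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef

def r_squared(y, y_hat):
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic "descriptors" (assumption): two chemical classes whose
# activity depends on the first descriptor with opposite sign.
rng = np.random.default_rng(1)
X_a = rng.normal(loc=-3.0, scale=0.5, size=(40, 2))
X_b = rng.normal(loc=3.0, scale=0.5, size=(40, 2))
y_a = 1.0 + 2.0 * X_a[:, 0] + rng.normal(scale=0.1, size=40)
y_b = -1.0 - 2.0 * X_b[:, 0] + rng.normal(scale=0.1, size=40)
X = np.vstack([X_a, X_b])
y = np.concatenate([y_a, y_b])

# One global model on all compounds, one local model per cluster.
global_coef = fit_linear(X, y)
labels = kmeans(X, k=2)

results = []  # (cluster id, subset size, local r2, global r2 on same subset)
for c in np.unique(labels):
    idx = labels == c
    if idx.sum() < 5:  # skip clusters too small to fit meaningfully
        continue
    local_coef = fit_linear(X[idx], y[idx])
    r2_local = r_squared(y[idx], predict_linear(local_coef, X[idx]))
    r2_global = r_squared(y[idx], predict_linear(global_coef, X[idx]))
    results.append((int(c), int(idx.sum()), r2_local, r2_global))

for c, n, r2_l, r2_g in results:
    print(f"cluster {c} (n={n}): local r2={r2_l:.3f}, global r2={r2_g:.3f}")
```

A consensus prediction in this setting could be formed, for example, by averaging the predictions of several local models built from different clusterings; the averaging scheme is likewise an assumption and not the study's specific procedure.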
Posts tagged as 'QSAR'
Comparative analysis of local and consensus quantitative structure-activity relationship approaches for the prediction of biocon
2-Aminothiazoles are a class of compounds capable of treating life-threatening prion diseases. QSAR studies on a set of forty-seven 2-aminothiazole derivatives possessing anti-prion activity were performed using multivariate analysis, which comprised multiple linear regression (MLR), artificial neural network (ANN) and support vector machine (SVM). The results indicated that MLR afforded reasonable performance, with a correlation coefficient (r) and root mean squared error (RMSE) of 0.9073 and 0.2977, respectively, as obtained from leave-one-out cross-validation (LOO-CV). More sophisticated learning methods such as SVM provided models with the highest accuracy, with r and RMSE of 0.9471 and 0.2264, respectively, while ANN gave reasonable performance with r and RMSE of 0.9023 and 0.3043, respectively, as obtained from LOO-CV calculations. Descriptor analysis from the regression coefficients of the MLR model suggested that compounds should be asymmetrical molecules with a low propensity to form hydrogen bonds and a high frequency of N content at topological distance 02 in order to provide good activities. Insights from these QSAR studies are anticipated to be useful in the design of novel derivatives based on the 2-aminothiazole scaffold as potent therapeutic agents against prion diseases.
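The leave-one-out evaluation of an MLR model mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions: the descriptors and activities are synthetic (47 rows chosen only to echo the data set size), the function name `loo_cv_mlr` is hypothetical, and the resulting r and RMSE bear no relation to the values reported in the study.

```python
import numpy as np

def loo_cv_mlr(X, y):
    """Leave-one-out cross-validation for multiple linear regression.

    Each compound is predicted by a model fitted on all other compounds.
    Returns the correlation coefficient r, the RMSE and the predictions.
    """
    n = len(y)
    A = np.column_stack([np.ones(n), X])  # add intercept column
    preds = np.empty(n)
    for i in range(n):
        mask = np.arange(n) != i          # hold out compound i
        coef, *_ = np.linalg.lstsq(A[mask], y[mask], rcond=None)
        preds[i] = A[i] @ coef
    r = np.corrcoef(y, preds)[0, 1]
    rmse = np.sqrt(np.mean((y - preds) ** 2))
    return r, rmse, preds

# Synthetic stand-in data (assumption): 47 compounds, 3 descriptors.
rng = np.random.default_rng(0)
X = rng.normal(size=(47, 3))
y = 1.5 * X[:, 0] - 2.0 * X[:, 1] + 0.5 * X[:, 2] + rng.normal(scale=0.3, size=47)

r, rmse, preds = loo_cv_mlr(X, y)
print(f"LOO-CV r = {r:.4f}, RMSE = {rmse:.4f}")
```

Because every prediction comes from a model that never saw the held-out compound, r and RMSE computed this way estimate predictive rather than fitted performance, which is why they are suitable for comparing MLR against more flexible learners such as ANN and SVM.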