TY - JOUR
T1 - Artificial neural network-based analysis of high-throughput screening data for improved prediction of active compounds
AU - Chakrabarti, Swapan
AU - Svojanovsky, Stan R.
AU - Slavik, Romana
AU - Georg, Gunda I.
AU - Wilson, George S.
AU - Smith, Peter G.
PY - 2009/12
Y1 - 2009/12
N2 - Artificial neural networks (ANNs) are trained using high-throughput screening (HTS) data to recover active compounds from a large data set. Improved classification performance was obtained on combining predictions made by multiple ANNs. The HTS data, acquired from a methionine aminopeptidases inhibition study, consisted of a library of 43,347 compounds, and the ratio of active to nonactive compounds, RA/N, was 0.0321. Back-propagation ANNs were trained and validated using principal components derived from the physicochemical features of the compounds. On selecting the training parameters carefully, an ANN recovers one-third of all active compounds from the validation set with a 3-fold gain in RA/N value. Further gains in RA/N values were obtained upon combining the predictions made by a number of ANNs. The generalization property of the back-propagation ANNs was used to train those ANNs with the same training samples, after being initialized with different sets of random weights. As a result, only 10% of all available compounds were needed for training and validation, and the rest of the data set was screened with more than a 10-fold gain of the original RA/N value. Thus, ANNs trained with limited HTS data might become useful in recovering active compounds from large data sets.
AB - Artificial neural networks (ANNs) are trained using high-throughput screening (HTS) data to recover active compounds from a large data set. Improved classification performance was obtained on combining predictions made by multiple ANNs. The HTS data, acquired from a methionine aminopeptidases inhibition study, consisted of a library of 43,347 compounds, and the ratio of active to nonactive compounds, RA/N, was 0.0321. Back-propagation ANNs were trained and validated using principal components derived from the physicochemical features of the compounds. On selecting the training parameters carefully, an ANN recovers one-third of all active compounds from the validation set with a 3-fold gain in RA/N value. Further gains in RA/N values were obtained upon combining the predictions made by a number of ANNs. The generalization property of the back-propagation ANNs was used to train those ANNs with the same training samples, after being initialized with different sets of random weights. As a result, only 10% of all available compounds were needed for training and validation, and the rest of the data set was screened with more than a 10-fold gain of the original RA/N value. Thus, ANNs trained with limited HTS data might become useful in recovering active compounds from large data sets.
KW - Generalization property
KW - Neural networks
KW - Pattern classification
UR - http://www.scopus.com/inward/record.url?scp=71949128923&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=71949128923&partnerID=8YFLogxK
U2 - 10.1177/1087057109351312
DO - 10.1177/1087057109351312
M3 - Article
C2 - 19940083
AN - SCOPUS:71949128923
SN - 1087-0571
VL - 14
SP - 1236
EP - 1244
JO - Journal of Biomolecular Screening
JF - Journal of Biomolecular Screening
IS - 10
ER -