Optimal neighbor selection in molecular similarity: Comparison of arbitrary versus tailored prediction spaces

Brian D Gute, Subhash C Basak

Research output: Contribution to journal › Article › peer-review

9 Scopus citations

Abstract

Three classes of arbitrary quantitative molecular similarity analysis (QMSA) methods have been computed using atom pairs (APs), topological indices (TIs), and principal components (PCs) derived from topological indices. Tailored QMSA models have been developed from TIs selected through ridge regression. K-nearest neighbor (kNN) based estimation has been applied to all of the methods to estimate normal vapor pressure (pvap) and water solubility (sol) for a set of 194 chemicals. Results show that the tailored QMSA methods are superior to arbitrary similarity methods in estimating both of these properties for the given set of chemicals.
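
The kNN-based estimation described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the distance measure, the value of k, or the validation scheme, so Euclidean distance, an unweighted neighbor mean, and leave-one-out estimation are all assumptions here. The function name `knn_loo_estimates` and the toy data are hypothetical.

```python
import numpy as np

def knn_loo_estimates(X, y, k=5):
    """Leave-one-out kNN estimates.

    Each chemical's property is estimated as the unweighted mean of the
    property values of its k nearest neighbors in the descriptor space X.
    Euclidean distance is an assumption, not taken from the paper.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Pairwise Euclidean distances between all rows of X.
    diffs = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=-1))
    # Exclude each chemical from its own neighbor list.
    np.fill_diagonal(dist, np.inf)
    # Indices of the k closest chemicals for every row.
    neighbors = np.argsort(dist, axis=1)[:, :k]
    return y[neighbors].mean(axis=1)

# Toy data for illustration only (not from the paper):
# one descriptor per chemical, property equal to the descriptor.
X_toy = np.array([[0.0], [1.0], [2.0], [10.0]])
y_toy = np.array([0.0, 1.0, 2.0, 10.0])
estimates = knn_loo_estimates(X_toy, y_toy, k=2)
```

Under these assumptions, the comparison between arbitrary and tailored spaces amounts to calling the same function with different descriptor matrices, for example all TIs or their PCs versus only the TIs retained by ridge regression for the property of interest.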

Original language: English (US)
Pages (from-to): 37-51
Number of pages: 15
Journal: SAR and QSAR in Environmental Research
Volume: 17
Issue number: 1
State: Published - Feb 2006

Bibliographical note

Funding Information:
This manuscript is contribution number 396 from the Center for Water and the Environment of the Natural Resources Research Institute. This material is based on research sponsored by the Air Force Research Laboratory, under agreement number F49620-02-1-0138. The US Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright notation thereon.

Keywords

  • Atom pairs
  • Quantitative molecular similarity analysis (QMSA)
  • Tailored QMSA
  • Topological indices
  • kNN
