ChemModLab: A web-based cheminformatics modeling laboratory

Jacqueline M. Hughes-Oliver, Atina D. Brooks, William J. Welch, Morteza G. Khaledi, Douglas Hawkins, S. Stanley Young, Kirtesh Patil, Gary W. Howell, Raymond T. Ng, Moody T. Chu

Research output: Contribution to journal › Article › peer-review

8 Scopus citations

Abstract

ChemModLab, written by the ECCR @ NCSU consortium under NIH support, is a toolbox for fitting and assessing quantitative structure-activity relationships (QSARs). Its elements are: a cheminformatic front end used to supply molecular descriptors for use in modeling; a set of methods for fitting models; and methods for validating the resulting model. Compounds may be input as structures from which standard descriptors will be calculated using the freely available cheminformatic front end PowerMV; PowerMV also supports compound visualization. In addition, the user can directly input their own choices of descriptors, so the capability for comparing descriptors is effectively unlimited. The statistical methodologies comprise a comprehensive collection of approaches whose validity and utility have been accepted by experts in the fields. As far as possible, these tools are implemented in open-source software linked into the flexible R platform, giving the user the capability of applying many different QSAR modeling methods in a seamless way. As promising new QSAR methodologies emerge from the statistical and data-mining communities, they will be incorporated in the laboratory. The web site also incorporates links to public-domain data sets that can be used as test cases for proposed new modeling methods. The capabilities of ChemModLab are illustrated using a variety of biological responses, with different modeling methodologies being applied to each. These show clear differences in quality of the fitted QSAR model, and in computational requirements. The laboratory is web-based, and use is free. Researchers with new assay data, a new descriptor set, or a new modeling method may readily build QSAR models and benchmark their results against other findings. Users may also examine the diversity of the molecules identified by a QSAR model. 
Moreover, users may place their data sets in a public area to facilitate communication with other researchers, or keep them hidden to preserve confidentiality.
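
As an illustration only, the descriptor, model-fitting, and validation stages described in the abstract can be sketched in a few lines. The sketch below is not ChemModLab code and does not use its R interface. It assumes binary fingerprint descriptors stored as sets of "on" bits, swaps in a simple nearest-neighbor classifier with Tanimoto similarity as the modeling method, and estimates accuracy by k-fold cross-validation on synthetic data. All function names (`tanimoto`, `knn_predict`, `cross_validate`, `make_toy_data`) are hypothetical.

```python
import random


def tanimoto(a, b):
    """Tanimoto similarity between two sets of 'on' fingerprint bits."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def knn_predict(train, query, k=3):
    """Predict activity (0 or 1) by majority vote of the k most similar compounds."""
    ranked = sorted(train, key=lambda item: tanimoto(item[0], query), reverse=True)
    votes = [label for _, label in ranked[:k]]
    return int(sum(votes) * 2 > len(votes))


def cross_validate(data, k_neighbors=3, folds=5, seed=0):
    """Return the k-fold cross-validated accuracy of the nearest-neighbor model."""
    rows = data[:]
    random.Random(seed).shuffle(rows)
    correct = 0
    for f in range(folds):
        # Every folds-th row (offset f) is held out; the rest are used for training.
        test = rows[f::folds]
        train = [r for i, r in enumerate(rows) if i % folds != f]
        correct += sum(knn_predict(train, x, k_neighbors) == y for x, y in test)
    return correct / len(rows)


def make_toy_data(n=60, seed=1):
    """Synthetic compounds (assumption): actives favor bits 0-9, inactives bits 10-19."""
    rng = random.Random(seed)
    data = []
    for i in range(n):
        label = i % 2
        core = range(0, 10) if label else range(10, 20)
        # Six class-specific bits plus three shared "noise" bits per compound.
        bits = set(rng.sample(list(core), 6)) | set(rng.sample(range(20, 40), 3))
        data.append((frozenset(bits), label))
    return data


if __name__ == "__main__":
    data = make_toy_data()
    # Compare settings of the same method under an identical validation scheme.
    for k in (1, 3, 5):
        print(k, cross_validate(data, k_neighbors=k))
```

Holding the validation scheme fixed while varying the method or its settings is the kind of benchmarking comparison the abstract refers to; a real study would replace the synthetic data with computed descriptors and measured assay responses.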

Original language: English (US)
Pages (from-to): 61-81
Number of pages: 21
Journal: In Silico Biology
Volume: 11
Issue number: 1-2
State: Published - 2011

Keywords

  • Cheminformatics
  • QSAR
  • data-mining
  • ensemble methods
  • model assessment
  • model validation
  • nearest neighbors
  • neural networks
  • recursive partitioning
  • regression
  • support vector machine
  • virtual screening
