LIBRUS: Combined machine learning and homology information for sequence-based ligand-binding residue prediction

Research output: Contribution to journal › Article › peer-review

15 Scopus citations


Motivation: Identifying residues that interact with ligands is useful as a first step to understanding protein function and as an aid to designing small molecules that target the protein for interaction. Several studies have shown that sequence features are very informative for this type of prediction, while structure features have also been useful when structure is available. We develop a sequence-based method, called LIBRUS, that combines homology-based transfer with direct prediction using machine learning, and we compare it to previous sequence-based work and current structure-based methods.

Results: Our analysis shows that homology-based transfer is slightly more discriminating than a support vector machine learner using profiles and predicted secondary structure. We combine these two approaches in a method called LIBRUS. On a benchmark of 885 sequence-independent proteins, it achieves an area under the ROC curve (ROC) of 0.83 with 45% precision at 50% recall, a significant improvement over previous sequence-based efforts. On an independent benchmark set, FINDSITE, a current method based on structure features, achieves an ROC of 0.81 with 54% precision at 50% recall, while LIBRUS achieves an ROC of 0.82 with 39% precision at 50% recall at a smaller computational cost. When LIBRUS and FINDSITE predictions are combined, performance improves beyond that of either method alone, reaching an ROC of 0.86 and 59% precision at 50% recall.
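For readers unfamiliar with the two figures quoted in the abstract, the sketch below shows one common way to compute an area under the ROC curve and a precision at a fixed recall level from per-residue scores. It is an illustration only, not the authors' evaluation code: the function names, the toy labels and scores, and the tie-handling choices are all assumptions made for this example.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation.

    labels: 1 for a ligand-binding residue, 0 otherwise (hypothetical encoding).
    scores: higher means "more likely binding".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative residue")
    # Fraction of (positive, negative) pairs ranked correctly; ties count half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def precision_at_recall(labels: Sequence[int], scores: Sequence[float],
                        target_recall: float = 0.5) -> float:
    """Precision at the first rank cutoff whose recall reaches target_recall.

    Assumption: tied scores are ordered arbitrarily by the sort.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive residue")
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    true_pos = 0
    for k, (_, y) in enumerate(ranked, start=1):
        true_pos += y
        if true_pos / total_pos >= target_recall:
            return true_pos / k
    return 0.0  # not reached when 0 < target_recall <= 1


# Toy data (invented for illustration; not taken from the paper).
toy_labels = [1, 0, 1, 0, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.6, 0.2, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
toy_prec = precision_at_recall(toy_labels, toy_scores, target_recall=0.5)
```

Because binding residues are typically a small minority of all residues, precision at a fixed recall can differ noticeably between methods whose ROC values are close, which is consistent with the abstract reporting both numbers side by side.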

Original language: English (US)
Article number: btp561
Pages (from-to): 3099-3107
Number of pages: 9
Issue number: 23
State: Published - Sep 28 2009

Bibliographical note

Funding Information:
Funding: National Institutes of Health (T32GM008347, RLM008713A); the National Science Foundation (IIS-0431135, IIS-0905220); the University of Minnesota Digital Technology Center.
