Multi-assay-based structure-activity relationship models: Improving structure-activity relationship models by incorporating activity information from related targets

Xia Ning; Huzefa Rangwala; George Karypis

doi:10.1021/ci900182q

Computer Science and Engineering

Research output: Contribution to journal › Article › peer-review

40 Scopus citations

Abstract

Structure-activity relationship (SAR) models are used to inform and to guide the iterative optimization of chemical leads, and they play a fundamental role in modern drug discovery. In this paper, we present a new class of methods for building SAR models, referred to as multi-assay based, that utilize activity information from different targets. These methods first identify a set of targets that are related to the target under consideration, and then they employ various machine learning techniques that utilize activity information from these targets in order to build the desired SAR model. We developed different methods for identifying the set of related targets, which take into account the primary sequence of the targets or the structure of their ligands, and we also developed different machine learning techniques that were derived by using principles of semi-supervised learning, multi-task learning, and classifier ensembles. The comprehensive evaluation of these methods shows that they lead to considerable improvements over the standard SAR models that are based only on the ligands of the target under consideration. On a set of 117 protein targets, obtained from PubChem, these multi-assay-based methods achieve a receiver-operating characteristic score that is, on the average, 7.0-7.2% higher than that achieved by the standard SAR models. Moreover, on a set of targets belonging to six protein families, the multi-assay-based methods outperform chemogenomics-based approaches by 4.33%.
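
The two-step idea described in the abstract (first find related targets, then reuse their activity information) can be illustrated with a toy sketch. This is not the authors' method: it assumes ligands are represented as sets of integer feature IDs, swaps in plain Tanimoto similarity as the ligand-based relatedness measure, and simply pools ligands from the top-k related targets. All names (`tanimoto`, `target_similarity`, `related_targets`, `augmented_training_set`, `toy`) are hypothetical and chosen only for illustration.

```python
from itertools import product


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two feature sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def target_similarity(ligands_a, ligands_b) -> float:
    """Mean pairwise Tanimoto similarity between two targets' ligands.

    Assumption: this average is only a stand-in for a ligand-based
    relatedness measure; the paper's actual measures are not given here.
    """
    pairs = list(product(ligands_a, ligands_b))
    if not pairs:
        return 0.0
    return sum(tanimoto(x, y) for x, y in pairs) / len(pairs)


def related_targets(target, ligands_by_target, k=2):
    """Return the k targets whose ligands are most similar to `target`'s."""
    scores = {
        other: target_similarity(ligands_by_target[target], ligands)
        for other, ligands in ligands_by_target.items()
        if other != target
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]


def augmented_training_set(target, ligands_by_target, k=2):
    """Pool the target's own ligands with those of its k related targets."""
    pooled = list(ligands_by_target[target])
    for other in related_targets(target, ligands_by_target, k):
        pooled.extend(ligands_by_target[other])
    return pooled


# Toy data (made up): three targets with active ligands as feature sets.
toy = {
    "T1": [frozenset({1, 2, 3}), frozenset({2, 3, 4})],
    "T2": [frozenset({1, 2, 3, 5})],
    "T3": [frozenset({7, 8, 9})],
}

print(related_targets("T1", toy, k=1))
print(len(augmented_training_set("T1", toy, k=1)))
```

In this toy data, T2 shares most features with T1's ligands while T3 shares none, so T2 is selected as the related target. A real pipeline would then train a classifier on the pooled set; how the borrowed examples are weighted or combined (semi-supervised, multi-task, or ensemble) is exactly where the methods named in the abstract differ.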

Original language	English (US)
Pages (from-to)	2444-2456
Number of pages	13
Journal	Journal of Chemical Information and Modeling
Volume	49
Issue number	11
DOIs	https://doi.org/10.1021/ci900182q
State	Published - Nov 23 2009

Cite this

Multi-assay-based structure-activity relationship models: Improving structure-activity relationship models by incorporating activity information from related targets. / Ning, Xia; Rangwala, Huzefa; Karypis, George.
In: Journal of Chemical Information and Modeling, Vol. 49, No. 11, 23.11.2009, p. 2444-2456.


@article{99a6bc5ec36c4f928ecf89e284fa4b61,

title = "Multi-assay-based structure-activity relationship models: Improving structure-activity relationship models by incorporating activity information from related targets",

abstract = "Structure-activity relationship (SAR) models are used to inform and to guide the iterative optimization of chemical leads, and they play a fundamental role in modern drug discovery. In this paper, we present a new class of methods for building SAR models, referred to as multi-assay based, that utilize activity information from different targets. These methods first identify a set of targets that are related to the target under consideration, and then they employ various machine learning techniques that utilize activity information from these targets in order to build the desired SAR model. We developed different methods for identifying the set of related targets, which take into account the primary sequence of the targets or the structure of their ligands, and we also developed different machine learning techniques that were derived by using principles of semi-supervised learning, multi-task learning, and classifier ensembles. The comprehensive evaluation of these methods shows that they lead to considerable improvements over the standard SAR models that are based only on the ligands of the target under consideration. On a set of 117 protein targets, obtained from PubChem, these multi-assay-based methods achieve a receiver-operating characteristic score that is, on the average, 7.0-7.2% higher than that achieved by the standard SAR models. Moreover, on a set of targets belonging to six protein families, the multi-assay-based methods outperform chemogenomics-based approaches by 4.33%.",

author = "Xia Ning and Huzefa Rangwala and George Karypis",

year = "2009",

month = nov,

day = "23",

doi = "10.1021/ci900182q",

language = "English (US)",

volume = "49",

pages = "2444--2456",

journal = "Journal of Chemical Information and Modeling",

issn = "1549-9596",

publisher = "American Chemical Society",

number = "11",

}

TY - JOUR

T1 - Multi-assay-based structure-activity relationship models

T2 - Improving structure-activity relationship models by incorporating activity information from related targets

AU - Ning, Xia

AU - Rangwala, Huzefa

AU - Karypis, George

PY - 2009/11/23

Y1 - 2009/11/23

AB - Structure-activity relationship (SAR) models are used to inform and to guide the iterative optimization of chemical leads, and they play a fundamental role in modern drug discovery. In this paper, we present a new class of methods for building SAR models, referred to as multi-assay based, that utilize activity information from different targets. These methods first identify a set of targets that are related to the target under consideration, and then they employ various machine learning techniques that utilize activity information from these targets in order to build the desired SAR model. We developed different methods for identifying the set of related targets, which take into account the primary sequence of the targets or the structure of their ligands, and we also developed different machine learning techniques that were derived by using principles of semi-supervised learning, multi-task learning, and classifier ensembles. The comprehensive evaluation of these methods shows that they lead to considerable improvements over the standard SAR models that are based only on the ligands of the target under consideration. On a set of 117 protein targets, obtained from PubChem, these multi-assay-based methods achieve a receiver-operating characteristic score that is, on the average, 7.0-7.2% higher than that achieved by the standard SAR models. Moreover, on a set of targets belonging to six protein families, the multi-assay-based methods outperform chemogenomics-based approaches by 4.33%.

UR - http://www.scopus.com/inward/record.url?scp=72949114936&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=72949114936&partnerID=8YFLogxK

U2 - 10.1021/ci900182q

DO - 10.1021/ci900182q

M3 - Article

C2 - 19842624

AN - SCOPUS:72949114936

SN - 1549-9596

VL - 49

SP - 2444

EP - 2456

JO - Journal of Chemical Information and Modeling

JF - Journal of Chemical Information and Modeling

IS - 11

ER -
