A simple convolutional neural network for prediction of enhancer-promoter interactions with DNA sequence data

Zhong Zhuang; Xiaotong Shen; Wei Pan

doi:10.1093/bioinformatics/bty1050

Research output: Contribution to journal › Article › peer-review

45 Scopus citations

Abstract

Motivation: Enhancer-promoter interactions (EPIs) in the genome play an important role in transcriptional regulation. EPIs can be useful in boosting statistical power and enhancing mechanistic interpretation for disease- or trait-associated genetic variants in genome-wide association studies. Instead of expensive and time-consuming biological experiments, computational prediction of EPIs with DNA sequence and other genomic data is a fast and viable alternative. In particular, deep learning and other machine learning methods have been demonstrated with promising performance.

Results: First, using a published human cell line dataset, we demonstrate that a simple convolutional neural network (CNN) performs as well as, if not better than, a more complicated and state-of-the-art architecture, a hybrid of a CNN and a recurrent neural network. More importantly, in spite of the well-known cell line-specific EPIs (and corresponding gene expression), in contrast to the standard practice of training and predicting for each cell line separately, we propose two transfer learning approaches to training a model using all cell lines to various extents, leading to substantially improved predictive performance.

Original language	English (US)
Pages (from-to)	2899-2906
Number of pages	8
Journal	Bioinformatics
Volume	35
Issue number	17
DOIs	https://doi.org/10.1093/bioinformatics/bty1050
State	Published - Sep 1 2019

Bibliographical note

Funding Information:
The authors thank the reviewers, Mengli Xiao and Chong Wu for helpful comments. This research was supported by NIH grants R01GM126002, R01HL116720, R01GM113250 and R01HL105397, and by the Minnesota Supercomputing Institute.

Publisher Copyright:
© 2019 The Author(s).

Cite this

@article{833f2dfea8114896b7432cde1275dacf,

title = "A simple convolutional neural network for prediction of enhancer-promoter interactions with DNA sequence data",

abstract = "Motivation: Enhancer-promoter interactions (EPIs) in the genome play an important role in transcriptional regulation. EPIs can be useful in boosting statistical power and enhancing mechanistic interpretation for disease- or trait-associated genetic variants in genome-wide association studies. Instead of expensive and time-consuming biological experiments, computational prediction of EPIs with DNA sequence and other genomic data is a fast and viable alternative. In particular, deep learning and other machine learning methods have been demonstrated with promising performance. Results: First, using a published human cell line dataset, we demonstrate that a simple convolutional neural network (CNN) performs as well as, if not better than, a more complicated and state-of-the-art architecture, a hybrid of a CNN and a recurrent neural network. More importantly, in spite of the well-known cell line-specific EPIs (and corresponding gene expression), in contrast to the standard practice of training and predicting for each cell line separately, we propose two transfer learning approaches to training a model using all cell lines to various extents, leading to substantially improved predictive performance.",

author = "Zhong Zhuang and Xiaotong Shen and Wei Pan",

note = "Funding Information: The authors thank the reviewers, Mengli Xiao and Chong Wu for helpful comments. This research was supported by NIH grants R01GM126002, R01HL116720, R01GM113250 and R01HL105397, and by the Minnesota Supercomputing Institute. Publisher Copyright: {\textcopyright} 2019 The Author(s).",

year = "2019",

month = sep,

day = "1",

doi = "10.1093/bioinformatics/bty1050",

language = "English (US)",

volume = "35",

pages = "2899--2906",

journal = "Bioinformatics",

issn = "1367-4803",

publisher = "Oxford University Press",

number = "17",

}

TY - JOUR

T1 - A simple convolutional neural network for prediction of enhancer-promoter interactions with DNA sequence data

AU - Zhuang, Zhong

AU - Shen, Xiaotong

AU - Pan, Wei

N1 - Funding Information: The authors thank the reviewers, Mengli Xiao and Chong Wu for helpful comments. This research was supported by NIH grants R01GM126002, R01HL116720, R01GM113250 and R01HL105397, and by the Minnesota Supercomputing Institute. Publisher Copyright: © 2019 The Author(s).

PY - 2019/9/1

Y1 - 2019/9/1

N2 - Motivation: Enhancer-promoter interactions (EPIs) in the genome play an important role in transcriptional regulation. EPIs can be useful in boosting statistical power and enhancing mechanistic interpretation for disease- or trait-associated genetic variants in genome-wide association studies. Instead of expensive and time-consuming biological experiments, computational prediction of EPIs with DNA sequence and other genomic data is a fast and viable alternative. In particular, deep learning and other machine learning methods have been demonstrated with promising performance. Results: First, using a published human cell line dataset, we demonstrate that a simple convolutional neural network (CNN) performs as well as, if not better than, a more complicated and state-of-the-art architecture, a hybrid of a CNN and a recurrent neural network. More importantly, in spite of the well-known cell line-specific EPIs (and corresponding gene expression), in contrast to the standard practice of training and predicting for each cell line separately, we propose two transfer learning approaches to training a model using all cell lines to various extents, leading to substantially improved predictive performance.

AB - Motivation: Enhancer-promoter interactions (EPIs) in the genome play an important role in transcriptional regulation. EPIs can be useful in boosting statistical power and enhancing mechanistic interpretation for disease- or trait-associated genetic variants in genome-wide association studies. Instead of expensive and time-consuming biological experiments, computational prediction of EPIs with DNA sequence and other genomic data is a fast and viable alternative. In particular, deep learning and other machine learning methods have been demonstrated with promising performance. Results: First, using a published human cell line dataset, we demonstrate that a simple convolutional neural network (CNN) performs as well as, if not better than, a more complicated and state-of-the-art architecture, a hybrid of a CNN and a recurrent neural network. More importantly, in spite of the well-known cell line-specific EPIs (and corresponding gene expression), in contrast to the standard practice of training and predicting for each cell line separately, we propose two transfer learning approaches to training a model using all cell lines to various extents, leading to substantially improved predictive performance.

UR - http://www.scopus.com/inward/record.url?scp=85071517568&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85071517568&partnerID=8YFLogxK

U2 - 10.1093/bioinformatics/bty1050

DO - 10.1093/bioinformatics/bty1050

M3 - Article

C2 - 30649185

AN - SCOPUS:85071517568

SN - 1367-4803

VL - 35

SP - 2899

EP - 2906

JO - Bioinformatics

JF - Bioinformatics

IS - 17

ER -