A simple convolutional neural network for prediction of enhancer-promoter interactions with DNA sequence data

Zhong Zhuang, Xiaotong Shen, Wei Pan

Research output: Contribution to journal › Article › peer-review

45 Scopus citations

Abstract

Motivation: Enhancer-promoter interactions (EPIs) in the genome play an important role in transcriptional regulation. EPIs can be useful in boosting statistical power and enhancing mechanistic interpretation for disease- or trait-associated genetic variants in genome-wide association studies. Compared with expensive and time-consuming biological experiments, computational prediction of EPIs from DNA sequence and other genomic data is a fast and viable alternative. In particular, deep learning and other machine learning methods have shown promising performance.

Results: First, using a published human cell line dataset, we demonstrate that a simple convolutional neural network (CNN) performs as well as, if not better than, a more complicated state-of-the-art architecture, a hybrid of a CNN and a recurrent neural network. More importantly, although EPIs (and the corresponding gene expression) are well known to be cell line-specific, and in contrast to the standard practice of training and predicting for each cell line separately, we propose two transfer learning approaches that train a model using all cell lines to various extents, leading to substantially improved predictive performance.
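As a rough illustration of what a "simple CNN" on DNA sequence can look like, the sketch below one-hot encodes an enhancer and a promoter sequence, passes each through a single convolution with ReLU and global max pooling, and combines the two feature vectors in a logistic output. It is not the authors' architecture: the function names, filter counts, kernel width, and random untrained weights are all assumptions chosen for illustration, and only NumPy is used.

```python
import numpy as np

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a (length, 4) array; unknown bases such as N become all-zero rows."""
    out = np.zeros((len(seq), 4), dtype=np.float32)
    for i, base in enumerate(seq.upper()):
        j = BASES.find(base)
        if j >= 0:
            out[i, j] = 1.0
    return out

def conv_relu_maxpool(x, kernels):
    """Apply a 'valid' 1D convolution, ReLU, and global max pooling.

    x: (L, 4) one-hot sequence; kernels: (n_filters, width, 4).
    Returns a (n_filters,) feature vector.
    """
    width = kernels.shape[1]
    # Stack every length-`width` window of the sequence: (L - width + 1, width, 4).
    windows = np.stack([x[i:i + width] for i in range(x.shape[0] - width + 1)])
    acts = np.einsum("lwc,fwc->lf", windows, kernels)  # (positions, n_filters)
    return np.maximum(acts, 0.0).max(axis=0)

def predict_interaction(enhancer, promoter, params):
    """Return a probability-like score in (0, 1) for an enhancer-promoter pair."""
    feats = np.concatenate([
        conv_relu_maxpool(one_hot(enhancer), params["enh_kernels"]),
        conv_relu_maxpool(one_hot(promoter), params["pro_kernels"]),
    ])
    logit = feats @ params["w"] + params["b"]
    return 1.0 / (1.0 + np.exp(-logit))

# Hypothetical, untrained parameters (assumed sizes: 8 filters of width 5 per branch).
rng = np.random.default_rng(0)
params = {
    "enh_kernels": rng.normal(scale=0.1, size=(8, 5, 4)),
    "pro_kernels": rng.normal(scale=0.1, size=(8, 5, 4)),
    "w": rng.normal(scale=0.1, size=16),
    "b": 0.0,
}
score = predict_interaction("ACGTTGCAAGCTNNACGT", "TTGACGGCTATAAAGGC", params)
```

With random weights the score carries no biological meaning. In practice the parameters would be fit to labelled enhancer-promoter pairs, and the transfer learning idea in the abstract would correspond to sharing or pre-training such parameters across cell lines rather than fitting each cell line from scratch.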

Original language: English (US)
Pages (from-to): 2899-2906
Number of pages: 8
Journal: Bioinformatics
Volume: 35
Issue number: 17
DOIs
State: Published - Sep 1 2019

Bibliographical note

Funding Information:
The authors thank the reviewers, Mengli Xiao and Chong Wu for helpful comments. This research was supported by NIH grants R01GM126002, R01HL116720, R01GM113250 and R01HL105397, and by the Minnesota Supercomputing Institute.

Publisher Copyright:
© 2019 The Author(s).
