Neural network analysis of protein tertiary structure

George L. Wilcox, Marius Poliac, Michael N. Liebman

Research output: Contribution to journal › Article › peer-review

16 Scopus citations

Abstract

We describe a large-scale application of a back-propagation neural network to the analysis, classification and prediction of protein secondary and tertiary structure from sequence information alone. A back-propagation network called BigNet has been implemented, along with a Network Description Language (NDL), on the 512 MWord Cray 2 at the Minnesota Supercomputer Center. The proof-of-concept experiments described here used a small, heterologous training set of small protein structures (15 proteins, each with fewer than 133 residues) from the Brookhaven Protein Data Bank (PDB). Simulations with one hidden layer and between half a million and ten million connections execute at three to five million connection updates per second in full back-propagation learning mode and routinely converge to solutions in which input of a hydrophobicity-coded sequence yields output distance matrices with 0.3 to 1.5% RMS deviation from the actual distance matrices. Although the training set is too small to expect useful generalization, some evidence of generalization appeared in the similar learning progress of homologous pairs within the training set and in the production of novel distance-matrix outputs upon presentation of novel input sequences. The discussion addresses limitations of the current implementation, plans for software improvements, and characteristics of future training sets.
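
To make the error measure in the abstract concrete, the following Python sketch builds a distance matrix from residue coordinates and computes a percentage RMS deviation between a predicted and an actual matrix. It is an illustration under stated assumptions, not the authors' code: the abstract does not say how the RMS deviation is normalized, so dividing by the largest actual distance is a guess, and the function names and the three-residue coordinates are hypothetical.

```python
import math


def distance_matrix(coords):
    """Pairwise Euclidean distances between residue positions (e.g. C-alpha atoms)."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def rms_deviation_percent(predicted, actual):
    """RMS element-wise difference as a percentage of the largest actual distance.

    Assumption: the normalization by the largest actual distance is chosen
    for illustration; the abstract does not define it.
    """
    n = len(actual)
    squared = sum(
        (predicted[i][j] - actual[i][j]) ** 2
        for i in range(n)
        for j in range(n)
    )
    rms = math.sqrt(squared / (n * n))
    scale = max(max(row) for row in actual)
    return 100.0 * rms / scale


# Hypothetical three-residue "structure" (coordinates in angstroms).
coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
actual = distance_matrix(coords)

# Hypothetical network output: every off-diagonal distance is 0.05 too large.
predicted = [
    [d if i == j else d + 0.05 for j, d in enumerate(row)]
    for i, row in enumerate(actual)
]

error = rms_deviation_percent(predicted, actual)
print(f"RMS deviation: {error:.2f}%")
```

With this normalization, the perturbed toy matrix deviates by a little under 1%, which falls inside the 0.3 to 1.5% range quoted in the abstract. A different normalization (for example, by the mean distance) would give a different figure for the same matrices.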

Original language: English (US)
Pages (from-to): 191-204, IN4, 205-211
Journal: Tetrahedron Computer Methodology
Volume: 3
Issue number: 3-4
State: Published - 1990

Bibliographical note

Funding Information:
The authors gratefully acknowledge the expert technical assistance of Yiyi Xin and Tidhar Carmeli, who contributed substantially to the conduct of the experiments described here; we also thank Joseph Habermann of the Minnesota Supercomputer Institute (MSI) and Bill King of the Minnesota Supercomputer Center, Inc. for their help in development of graphic display programs. We also acknowledge the following organizations for support of this research: MSI and Cray Research Inc. provided supercomputer access to GLW, and MSI partially supports MOP and YX; the Army High Performance Computing Research Center (AHPCRC) partially supports YX, and the National Institutes of Health (NIH grant R03-RR-05294) partially supported GLW, MOP and TC.

Keywords

  • Back-propagation
  • BigNet
  • Conformation prediction
  • Distance Matrix
  • Hydrophobicity coding
