Contact map prediction is of great interest for its application in fold recognition and protein 3D structure determination. In this paper we present a contact map prediction algorithm that employs Support Vector Machines as the machine learning tool and incorporates various features such as sequence profiles and their conservation, correlated mutation analysis (CMA) based on various amino acid physicochemical properties, and secondary structure. In addition, we evaluated the effectiveness of the different features on contact map prediction for different fold classes. On average, our predictor achieved a prediction accuracy of 0.2238, an improvement over a random predictor by a factor of 11.7, which is better than results reported in previous studies. Our study showed that predicted secondary structure features play an important role for proteins containing beta structures. Models based on secondary structure features and models based on CMA features produce different sets of predictions. Our study also suggests that models learned separately for different protein fold families may achieve better performance than a unified model.
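The two figures quoted in the abstract, prediction accuracy and improvement over a random predictor, can be illustrated with a minimal sketch. It assumes that accuracy means the fraction of predicted contacts that are true contacts, and that the random baseline is the fraction of true contacts among all candidate residue pairs; the abstract does not spell out either definition. Under that reading, the reported numbers would imply a random baseline of roughly 0.2238 / 11.7 ≈ 0.019. The function name `contact_prediction_scores` and the `min_separation` parameter are hypothetical and do not come from the paper.

```python
import numpy as np


def contact_prediction_scores(true_map, pred_map, min_separation=6):
    """Return (accuracy, random_accuracy, improvement) for one protein.

    true_map, pred_map: square boolean arrays of shape (L, L), where
        entry [i, j] is True if residues i and j are (predicted to be)
        in contact.
    min_separation: hypothetical minimum sequence separation |i - j|
        for a pair to be counted. ASSUMPTION: the value used in the
        paper is not given in the abstract.
    """
    true_map = np.asarray(true_map, dtype=bool)
    pred_map = np.asarray(pred_map, dtype=bool)
    if true_map.shape != pred_map.shape or true_map.shape[0] != true_map.shape[1]:
        raise ValueError("both maps must be square and of equal shape")

    # Candidate pairs: upper triangle with j - i >= min_separation.
    n = true_map.shape[0]
    i, j = np.triu_indices(n, k=min_separation)
    true_pairs = true_map[i, j]
    pred_pairs = pred_map[i, j]

    n_pred = int(pred_pairs.sum())
    n_true = int(true_pairs.sum())
    if n_pred == 0 or n_true == 0:
        raise ValueError("need at least one predicted and one true contact")

    # Accuracy: correctly predicted contacts / all predicted contacts.
    accuracy = float((true_pairs & pred_pairs).sum()) / n_pred
    # Random baseline: chance that a uniformly drawn candidate pair is a contact.
    random_accuracy = n_true / true_pairs.size
    return accuracy, random_accuracy, accuracy / random_accuracy
```

Averaging these per-protein values over a test set would give summary numbers of the kind quoted above, although how the authors actually aggregated their results is not stated here.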
|Original language||English (US)|
|Title of host publication||Proceedings - 3rd IEEE Symposium on BioInformatics and BioEngineering, BIBE 2003|
|Publisher||Institute of Electrical and Electronics Engineers Inc.|
|Number of pages||8|
|ISBN (Electronic)||0769519075, 9780769519074|
|State||Published - 2003|
|Event||3rd IEEE Symposium on BioInformatics and BioEngineering, BIBE 2003 - Bethesda, United States|
Duration: Mar 10 2003 → Mar 12 2003
Bibliographical note
Publisher Copyright: © 2003 IEEE.