Motivation: Sequence similarity often suggests evolutionary relationships between protein sequences that can be important for inferring similarity of structure or function. The most widely used pairwise sequence comparison algorithms for homology detection, such as BLAST and PSI-BLAST, often fail to detect less conserved, remotely related targets.

Results: In this paper, we propose a new general graph-based propagation algorithm, called MotifProp, to detect more subtle similarity relationships than pairwise comparison methods can. MotifProp is based on a protein-motif network, in which edges connect proteins to the k-mer-based motif features that they contain. We show that our new motif-based propagation algorithm can improve the ranking results over a base algorithm, such as PSI-BLAST, that is used to initialize the ranking. Despite the complex structure of the protein-motif network, MotifProp can be easily interpreted using the top-ranked motifs and motif-rich regions induced by the propagation, both of which are helpful for discovering conserved structural components in remote homologs.
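The propagation idea described above can be illustrated with a minimal sketch. It is not the update rule from the paper, which the abstract does not give. It assumes that motif features are simply all k-mers of each sequence, that scores are spread by plain averaging across the bipartite protein-motif graph, and that a mixing weight `alpha` keeps each protein tied to its initial score. The function names, the toy sequences, and the initial scores (standing in for a base ranking such as one derived from PSI-BLAST) are all hypothetical.

```python
def kmers(seq, k):
    """Return the set of length-k substrings of seq (assumed motif features)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def propagate(proteins, init_scores, k=3, alpha=0.5, iters=20):
    """Spread scores over a bipartite protein/k-mer graph.

    proteins:    dict mapping protein name -> sequence
    init_scores: dict mapping protein name -> initial score from a base
                 ranking (hypothetical values; missing proteins start at 0)
    alpha:       assumed weight that anchors each protein to its initial score
    """
    feats = {p: kmers(s, k) for p, s in proteins.items()}

    # Edges of the bipartite graph, indexed from the motif side.
    motif_to_proteins = {}
    for p, fs in feats.items():
        for m in fs:
            motif_to_proteins.setdefault(m, set()).add(p)

    start = {p: init_scores.get(p, 0.0) for p in proteins}
    p_score = dict(start)
    m_score = {}
    for _ in range(iters):
        # Motif score: mean score of the proteins that contain the motif.
        m_score = {
            m: sum(p_score[p] for p in ps) / len(ps)
            for m, ps in motif_to_proteins.items()
        }
        # Protein score: mix of its initial score and the mean of its motifs.
        new_scores = {}
        for p, fs in feats.items():
            spread = sum(m_score[m] for m in fs) / len(fs) if fs else 0.0
            new_scores[p] = alpha * start[p] + (1 - alpha) * spread
        p_score = new_scores
    return p_score, m_score


# Toy data: "remote" shares k-mers with "hitA" but has no initial score.
proteins = {
    "query": "ACDEFG",
    "hitA": "ACDEKL",
    "remote": "XXDEKL",
    "unrelated": "PPPPPP",
}
init = {"query": 1.0, "hitA": 0.8}

scores, motif_scores = propagate(proteins, init)
ranking = sorted(scores, key=scores.get, reverse=True)
top_motifs = sorted(motif_scores, key=motif_scores.get, reverse=True)[:3]
```

In this toy setting, `remote` starts with no score but picks one up through the k-mers it shares with `hitA`, while `unrelated` shares no k-mers with any scored protein and stays at zero. Sorting `motif_scores` gives a rough analogue of the top-ranked motifs that the abstract mentions as an aid to interpretation.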
Bibliographical note
Funding Information:
The authors thank Dragomir R. Radev and Lan Xu for helpful discussions. The authors especially thank the computer support staff in the Department of Genome Sciences at the University of Washington. This work is supported by NIH grant GM074257-01 and NSF grant ITR-0312706. WSN is an Alfred P. Sloan Research Fellow.
Copyright 2008 Elsevier B.V., All rights reserved.