Abstract
The large number of markers in high-resolution radiation hybrid (RH) maps increasingly necessitates the use of data mining techniques to reduce both the computational complexity and the impact of noise in the original data. Traditionally, the RH mapping process has been treated as equivalent to the traveling salesman problem, with correspondingly high computational complexity. These techniques are also susceptible to noise, and unreliable markers can result in major disruptions of the overall order. In this paper, we propose a new approach that recognizes that the focus on nearest-neighbor distances, which characterizes the traveling salesman model, is no longer appropriate for the large number of markers in modern high-resolution mapping experiments. The proposed approach splits the mapping process into two levels, where the higher level operates only on the most stable markers of the lower level. A divide-and-conquer strategy, applied at the lower level, removes much of the impact of noise. Because of the high density of markers, only the most stable representatives from the lower level are then used at the higher level. The groupings within the lower level are so small that exhaustive search can be used. Markers are then mapped iteratively, while excluding problematic markers. The results for an RH mapping dataset of the human genome show that the proposed approach can construct high-resolution maps that agree closely with the physical maps in a comparatively short time.
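The lower-level step, exhaustive search within a small group of markers, can be illustrated with a minimal sketch. This is not the authors' implementation: the Hamming distance between retention vectors, the function names `hamming` and `best_order`, and the toy data are all assumptions made for illustration, since the abstract does not specify the distance measure or the scoring criterion.

```python
from itertools import permutations


def hamming(a, b):
    """Count the hybrids in which two markers differ (an assumed distance)."""
    return sum(x != y for x, y in zip(a, b))


def best_order(group, vectors):
    """Try every ordering of a small marker group and return the cheapest.

    The cost of an ordering is the sum of distances between adjacent
    markers. The search is factorial in the group size, so it is only
    feasible because the groups are kept very small.
    """
    def cost(order):
        return sum(hamming(vectors[m], vectors[n])
                   for m, n in zip(order, order[1:]))

    return min(permutations(group), key=cost)


# Hypothetical retention vectors (1 = marker retained in that hybrid).
vectors = {
    "m1": (1, 1, 0, 0, 0, 0),
    "m2": (1, 1, 1, 0, 0, 0),
    "m3": (0, 1, 1, 1, 0, 0),
    "m4": (0, 0, 1, 1, 1, 0),
}

order = best_order(["m3", "m1", "m4", "m2"], vectors)
print(order)
```

An ordering and its reverse always have the same cost, so the result fixes the marker order only up to orientation.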
Original language | English (US) |
---|---|
Title of host publication | ACM BCB 2014 - 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics |
Publisher | Association for Computing Machinery |
Pages | 541-550 |
Number of pages | 10 |
ISBN (Electronic) | 9781450328944 |
State | Published - Sep 20 2014 |
Event | 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, ACM BCB 2014 - Newport Beach, United States. Duration: Sep 20 2014 → Sep 23 2014 |
Publication series
Name | ACM BCB 2014 - 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics |
---|---|
Other
Other | 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, ACM BCB 2014 |
---|---|
Country/Territory | United States |
City | Newport Beach |
Period | 9/20/14 → 9/23/14 |
Bibliographical note
Publisher Copyright: Copyright © 2014 ACM.
Keywords
- Bioinformatics
- Clustering
- Data mining
- High-resolution maps
- Noisy datasets
- Radiation hybrid mapping