Abstract
The Human Leukocyte Antigen (HLA) gene system is one of the most polymorphic regions of the human genome, and its variation is among the most extensively studied because of its association with autoimmune, infectious, and inflammatory diseases such as rheumatoid arthritis, celiac disease, multiple sclerosis, and Type I diabetes. The HLA gene system also plays a crucial role in hematopoietic stem cell transplantation, where patients and donors are matched on their HLA genes to maximize the chances of a successful transplant. Complete HLA data is therefore of great use to clinicians and researchers. However, because the region is so polymorphic, obtaining such data is often prohibitively time-consuming and costly. Genome-wide association studies that find strong associations within the HLA region would ideally identify the exact HLA alleles responsible for the association, in order to determine the causal genes or variants. Here we propose a method to infer HLA alleles from widely available and affordable SNP genotype data. Our method takes into account the high linkage disequilibrium that exists in the region. We demonstrate that this additional information is an important asset in the HLA prediction problem.
Original language | English (US) |
---|---|
Title of host publication | Proceedings - 12th IEEE International Conference on Data Mining Workshops, ICDMW 2012 |
Pages | 964-971 |
Number of pages | 8 |
State | Published - Dec 1 2012 |
Event | 12th IEEE International Conference on Data Mining Workshops, ICDMW 2012, Brussels, Belgium, Dec 10 2012 |
Keywords
- HLA imputation
- Human Leukocyte Antigen
- Multi-label prediction
- SNP data