Genomic prediction using multi-allelic haplotype models improved the prediction accuracy for all seven human phenotypes: high density lipoproteins, low density lipoproteins, total cholesterol, triglycerides, and weight after normality transformation, and height and body mass index on their original scale without normality transformation. Eight SNP sets with 40,941-380,705 SNPs were evaluated. The increase in prediction accuracy due to haplotypes was 1.86-8.12%. Haplotypes defined by fixed chromosome distances had the best prediction accuracy for four phenotypes, haplotypes defined by a fixed number of SNPs for two phenotypes, and gene-based haplotypes for high density lipoproteins and height (tied for best). Haplotypes of coding genes were more accurate than haplotypes of all autosome genes (coding and noncoding) for triglycerides and weight, and nearly as accurate as haplotypes of all autosome genes for the other phenotypes. Haplotypes of noncoding genes (mostly lncRNAs) improved the prediction accuracy over the SNP models only for high density lipoproteins, total cholesterol, and height. ChIP-seq haplotypes had better prediction accuracy than gene-based haplotypes for total cholesterol, body mass index, and low density lipoproteins. The advantage of ChIP-seq haplotypes was most striking for low density lipoproteins, where all four haplotype models with ChIP-seq haplotypes had similarly high prediction accuracy, exceeding that of the best prediction model with gene-based haplotypes. Haplotype epistasis was shown to be the reason for the increased accuracy due to haplotypes. Low density lipoproteins had the largest haplotype epistasis heritability, which explained 14.70% of the phenotypic variance and was 31.27% of the SNP additive heritability, and also had the largest increase in prediction accuracy relative to the best SNP model (8.12%).
Relative to the SNP additive heritability of the same regions, noncoding genes had the highest haplotype epistasis heritability, followed by coding genes and ChIP-seq regions, for the seven phenotypes. SNP and haplotype heritability profiles showed that integrating SNP and haplotype additive values compensated for the weakness of haplotypes in estimating SNP heritabilities for four phenotypes, whereas models with haplotype additive values fully accounted for SNP additive values for the other three phenotypes. These results showed that haplotype analysis can be used to exploit functional and structural genomic information to improve the accuracy of genomic prediction.
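To make the idea of a multi-allelic haplotype concrete, the following minimal sketch shows one way haplotype blocks with a fixed number of SNPs could be recoded into multi-allelic block alleles. It is an illustration under stated assumptions, not the procedure used in the study: the function name `code_haplotype_blocks`, the representation of phased haplotypes as strings of 0/1 SNP alleles, and the toy data are all hypothetical.

```python
from typing import Dict, List, Sequence


def code_haplotype_blocks(haplotypes: Sequence[str], block_size: int) -> List[List[int]]:
    """Recode phased SNP haplotypes into multi-allelic block alleles.

    Assumption (not from the source): each string is one phased chromosome
    copy, written as 0/1 SNP alleles in map order. Blocks contain a fixed
    number of SNPs; the last block may be shorter if the SNP count is not
    a multiple of block_size.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    if not haplotypes:
        return []
    n_snps = len(haplotypes[0])
    if any(len(h) != n_snps for h in haplotypes):
        raise ValueError("all haplotypes must cover the same SNPs")

    coded: List[List[int]] = [[] for _ in haplotypes]
    for start in range(0, n_snps, block_size):
        # Within a block, every distinct SNP pattern is one haplotype allele,
        # so a block of k biallelic SNPs can carry up to 2**k alleles.
        allele_ids: Dict[str, int] = {}
        for row, hap in zip(coded, haplotypes):
            segment = hap[start:start + block_size]
            row.append(allele_ids.setdefault(segment, len(allele_ids)))
    return coded


# Toy data: four phased chromosome copies over four SNPs, blocks of two SNPs.
toy = ["0101", "0111", "1101", "0101"]
print(code_haplotype_blocks(toy, block_size=2))
```

In this toy example each block of two SNPs yields two distinct haplotype alleles, whereas each SNP on its own has only two possible alleles regardless of its neighbors. Blocks defined by fixed chromosome distance or by gene boundaries would differ only in how the block start and end positions are chosen.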
Bibliographical note
Funding Information:
The Framingham Heart Study is conducted and supported by the National Heart, Lung, and Blood Institute (NHLBI) in collaboration with Boston University. This manuscript was not prepared in collaboration with investigators of the Framingham Heart Study and does not necessarily reflect the opinions or views of the Framingham Heart Study, Boston University, or the NHLBI. Supercomputer computing time and storage for data analysis were provided by the Minnesota Supercomputer Institute at the University of Minnesota.
This research was supported by grants 2018-67015-28128 and 2020-67051-31133 from the USDA National Institute of Food and Agriculture, and by project MIN-16-124 of the Agricultural Experiment Station at the University of Minnesota. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
- coding gene
- genomic prediction
- haplotype epistasis
- human phenotypes
- non-coding gene
PubMed: MeSH publication types
- Journal Article