The 7.4 million plant accessions in gene banks are largely underutilized due to various resource constraints, but current genomic and analytic technologies are enabling us to mine this natural heritage. Here we report a proof-of-concept study to integrate genomic prediction into a broad germplasm evaluation process. First, a set of 962 biomass sorghum accessions was chosen as a reference set by germplasm curators. With high-throughput genotyping-by-sequencing (GBS), we genetically characterized this reference set with 340,496 single nucleotide polymorphisms (SNPs). A set of 299 accessions was selected as the training set to represent the overall diversity of the reference set, and we phenotypically characterized the training set for biomass yield and other related traits. Cross-validation with multiple analytical methods using the data of this training set indicated high prediction accuracy for biomass yield. Empirical experiments with a 200-accession validation set chosen from the reference set confirmed high prediction accuracy. The potential to apply the prediction model to broader genetic contexts was also examined with an independent population. Detailed analyses of prediction reliability provided new insights into strategy optimization. The success of this project illustrates that a global, cost-effective strategy may be designed to assess the vast amount of valuable germplasm archived in 1,750 gene banks.
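The cross-validation step described above can be illustrated with a minimal sketch. It is not the study's pipeline: the abstract does not name the analytical methods used, so ridge regression on SNP markers is swapped in here as one common genomic prediction approach. The genotypes and phenotypes are simulated, and the marker count, number of causal loci, penalty `lam`, and all function names are assumptions made for illustration only. Prediction accuracy is taken as the correlation between predicted and observed phenotypes in held-out folds.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Fit ridge regression on centered markers; returns (marker means, intercept, effects)."""
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    # Solve the n x n dual system, since markers usually far outnumber accessions.
    K = Xc @ Xc.T
    alpha = np.linalg.solve(K + lam * np.eye(len(y)), y - y_mean)
    beta = Xc.T @ alpha  # estimated marker effects
    return x_mean, y_mean, beta

def ridge_predict(model, X):
    """Predict phenotypes for new accessions from their marker scores."""
    x_mean, y_mean, beta = model
    return y_mean + (X - x_mean) @ beta

def cross_validated_accuracy(X, y, lam, k=5, seed=0):
    """Mean correlation between predicted and observed phenotypes over k folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    accuracies = []
    for test_idx in folds:
        train_mask = np.ones(len(y), dtype=bool)
        train_mask[test_idx] = False
        model = ridge_fit(X[train_mask], y[train_mask], lam)
        pred = ridge_predict(model, X[test_idx])
        accuracies.append(np.corrcoef(pred, y[test_idx])[0, 1])
    return float(np.mean(accuracies))

# Simulated stand-in data (assumption): 299 accessions, 2,000 markers coded 0/1/2.
rng = np.random.default_rng(42)
n_accessions, n_markers, n_causal = 299, 2000, 50
X = rng.integers(0, 3, size=(n_accessions, n_markers)).astype(float)
effects = np.zeros(n_markers)
causal = rng.choice(n_markers, size=n_causal, replace=False)
effects[causal] = rng.normal(size=n_causal)
genetic_value = X @ effects
# Noise variance set equal to genetic variance (assumed heritability near 0.5).
y = genetic_value + rng.normal(scale=genetic_value.std(), size=n_accessions)

# Penalty set to the marker count as a rough, untuned default (assumption).
accuracy = cross_validated_accuracy(X, y, lam=float(n_markers))
print(f"Mean 5-fold prediction accuracy (simulated data): {accuracy:.3f}")
```

In practice the penalty would be tuned or derived from variance component estimates, and the simulated, independent markers here do not reflect the linkage disequilibrium or population structure of a real sorghum panel, so the printed accuracy says nothing about the accuracy reported in the study.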
Bibliographical note
Funding Information:
This work was supported by the Agriculture and Food Research Initiative competitive grant (2011-03587) from the USDA National Institute of Food and Agriculture, by the National Science Foundation grant IOS-1238142, by the Kansas State University Center for Sorghum Improvement, by the Iowa State University Raymond F. Baker Center for Plant Breeding and by the Iowa State University Plant Science Institute. We appreciate K. Mayfield, L. Lambright and S. Staggenborg from Chromatin for conducting experiments at Lubbock, Texas.