We are building a framework physical infrastructure across the soybean genome by using SSR (simple sequence repeat) and RFLP (restriction fragment length polymorphism) markers to identify BACs (bacterial artificial chromosomes) from two soybean BAC libraries. The libraries were prepared from two genotypes, each digested with a different restriction enzyme. The BACs identified by each marker were grouped into contigs, and BAC-end sequences were obtained from BACs within each contig. The sequences were analyzed by the University of Minnesota Center for Computational Genomics and Bioinformatics using BLAST algorithms to search nucleotide and protein databases. The SSR-identified BACs had a higher percentage of significant BLAST hits than did the RFLP-identified BACs. This difference was due to a higher percentage of hits to repetitive-type sequences among the SSR-identified BACs. It was offset in part by a somewhat larger proportion of significant hits among the RFLP-identified BACs with similarity to experimentally defined genes and soybean ESTs (expressed sequence tags). These genes represented a wide range of metabolic functions. In these analyses, only repetitive sequences from SSR-identified contigs appeared to be clustered. The BAC-end sequences also allowed us to identify microsynteny between soybean and the model plants Arabidopsis thaliana and Medicago truncatula. This map-based approach to genome sampling provides a means of assaying soybean genome structure and organization.
- Glycine max
- Physical map