Genome-scale data offer the opportunity to clarify phylogenetic relationships that are difficult to resolve with few loci, but they can also identify genomic regions whose evolutionary history differs from the species history. We collected whole-genome sequence data from 29 taxa in the legume genus Medicago, then aligned these sequences to the Medicago truncatula reference genome to confidently identify 87 596 variable homologous sites. We used this data set to estimate phylogenetic relationships among Medicago species, to investigate the number of sites needed to provide robust phylogenetic estimates, and to identify specific genomic regions supporting topologies in conflict with the genome-wide phylogeny. Our full genomic data set resolves relationships within the genus that were previously intractable. Subsampling the data reveals considerable variation in phylogenetic signal and power among smaller subsets of the data. Even with samples of 5000 sites, no random subsample supports a topology identical to that of the genome-wide phylogeny. Phylogenetic relationships estimated from 500-site sliding windows revealed genome regions supporting several alternative species relationships among recently diverged taxa, consistent with the expected effects of deep coalescence or introgression in the recent history of Medicago.
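The subsampling and sliding-window steps described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names are hypothetical, the alignment is assumed to be a dictionary mapping taxon names to equal-length strings of variable sites, and the window step size and the choice to drop a trailing partial window are assumptions, since the abstract does not specify them. Tree inference on each subset is left to whatever phylogenetic software is in use.

```python
import random


def subsample_sites(alignment, n_sites, seed=None):
    """Draw n_sites alignment columns at random, without replacement.

    The same columns are kept for every taxon, so the subsample is still
    a valid alignment. Column order is preserved.
    """
    length = len(next(iter(alignment.values())))
    rng = random.Random(seed)
    cols = sorted(rng.sample(range(length), n_sites))
    return {taxon: "".join(seq[i] for i in cols)
            for taxon, seq in alignment.items()}


def sliding_windows(n_total, window=500, step=500):
    """Yield (start, end) half-open column ranges of fixed width.

    Assumption: a trailing window shorter than `window` is dropped.
    With step == window the windows do not overlap.
    """
    for start in range(0, n_total - window + 1, step):
        yield start, start + window


def window_alignment(alignment, start, end):
    """Slice every sequence to the columns in [start, end)."""
    return {taxon: seq[start:end] for taxon, seq in alignment.items()}


if __name__ == "__main__":
    # Toy alignment (made-up data, three taxa, ten sites).
    toy = {"A": "ACGTACGTAC", "B": "ACGAACGTTC", "C": "TCGTACGAAC"}
    print(subsample_sites(toy, 4, seed=1))
    for start, end in sliding_windows(10, window=4, step=3):
        print(start, end, window_alignment(toy, start, end))
```

Each subsample or window would then be passed to a tree-building program, and the resulting topology compared with the genome-wide phylogeny. Repeating `subsample_sites` with different seeds gives the replicate random samples needed to gauge how often a given sample size recovers the full-data topology.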
Bibliographical note
Funding Information:
FUNDING The Medicago HapMap Project was funded by the US National Science Foundation (PGRP-0820005) and by the Noble Foundation.
ACKNOWLEDGMENTS The authors thank Roxanne Denny for extensive assistance with data collection, and Keith Barker for invaluable guidance on analysis. Germplasm for the Medicago HapMap Project was provided by the Institut National de la Recherche Agronomique in Montpellier, France, and by the Noble Foundation. Sequence alignment, data processing, and phylogenetic analysis were conducted on systems provided by the Minnesota Supercomputing Institute.
- whole-genome resequencing