Diversity, Distribution, and Ancient Taxonomic Relationships within the TIR and non-TIR NBS-LRR Resistance Gene Subfamilies
Steven B. Cannon1, Hongyan Zhu4, Andrew M. Baumgarten1, Russell Spangler3, Georgiana May1,3, Douglas R. Cook5, Nevin D. Young1,2
1Department of Plant Biology, University of Minnesota; 2Department of Plant Pathology, University of Minnesota; 3Department of Ecology, Evolution, and Behavior, University of Minnesota; 4Graduate Program in Genetics, Texas A&M University; 5Department of Plant Pathology, University of California, Davis
Abstract
Phylogenetic relationships among the NBS-LRR (nucleotide binding site-leucine rich repeat) resistance gene homologs (RGHs) from 30 genera and 9 families were evaluated relative to phylogenies for these taxa. More than 800 NBS-LRR RGHs were analyzed, primarily from species in Fabaceae, Brassicaceae, Poaceae, and Solanaceae, but also from representatives of other angiosperm and gymnosperm families. Parsimony, maximum likelihood, and distance methods were used to classify these RGHs relative to previously observed gene subfamilies as well as within more closely related sequence clades. Grouping sequences using a distance cutoff of 250 PAM units (point accepted mutations per 100 residues) identified at least five ancient sequence clades with representatives from several plant families: the previously observed TIR gene subfamily, and a minimum of four deep splits within the non-TIR gene subfamily. The deep splits in the non-TIR subfamily are also reflected in comparisons of amino acid substitution rates in various species, and in ratios of nonsynonymous to synonymous nucleotide substitution rates (Ka/Ks values) in Arabidopsis thaliana. Lower Ka/Ks values in the TIR than in the non-TIR sequences suggest greater functional constraints in the TIR subfamily. At least three of the five identified ancient clades appear to predate the angiosperm-gymnosperm radiation. Monocot sequences are absent from the TIR subfamily, as observed in previous studies. In both subfamilies, clades with sequences separated by approximately 150 PAM units are family-specific but not genus-specific, providing a rough measure of minimum dates for the first diversification event within these clades. Within any one clade, particular taxa may be dramatically over- or under-represented, suggesting preferential expansions or losses of certain RGH types within particular taxa, and suggesting that no one species will provide models for all major sequence types in other taxa.
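As an illustration of what grouping sequences by a PAM distance cutoff can look like, the following is a minimal Python sketch. It assumes single-linkage grouping (two sequences share a group if a chain of pairwise distances at or below the cutoff connects them); the abstract does not state which grouping rule was used, so this choice is an assumption. The function name `group_by_cutoff`, the sequence labels, and all distance values are hypothetical and are not taken from the study.

```python
from itertools import combinations


def group_by_cutoff(names, dist, cutoff):
    """Group sequences by single linkage under a distance cutoff.

    names  : list of sequence labels
    dist   : dict mapping frozenset({a, b}) -> pairwise distance (PAM units)
    cutoff : two sequences are linked when their distance is <= cutoff

    Single linkage is an assumption made for this sketch.
    """
    # Union-find structure: each label starts as its own group root.
    parent = {n: n for n in names}

    def find(x):
        # Follow parent links to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge the groups of every pair that falls within the cutoff.
    for a, b in combinations(names, 2):
        if dist[frozenset((a, b))] <= cutoff:
            parent[find(a)] = find(b)

    # Collect members by root and return groups in a stable, sorted order.
    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical labels and made-up PAM distances, for illustration only.
names = ["seqA", "seqB", "seqC", "seqD"]
dist = {
    frozenset(("seqA", "seqB")): 120.0,
    frozenset(("seqA", "seqC")): 310.0,
    frozenset(("seqA", "seqD")): 400.0,
    frozenset(("seqB", "seqC")): 290.0,
    frozenset(("seqB", "seqD")): 380.0,
    frozenset(("seqC", "seqD")): 140.0,
}

clades_250 = group_by_cutoff(names, dist, cutoff=250.0)
print(clades_250)
```

With these made-up distances, a 250 PAM cutoff yields two groups, while raising the cutoff merges them and lowering it splits them, which is the sense in which different cutoffs probe different depths of divergence.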