Understanding the composition, evolution, and function of the Gossypium hirsutum (cotton) genome is complicated by the joint presence of two genomes in its nucleus (AT and DT genomes). These two genomes were derived from progenitor A-genome and D-genome diploids involved in ancestral allopolyploidization. To better understand the allopolyploid genome, we re-sequenced the genomes of extant diploid relatives that contain the A1 (Gossypium herbaceum), A2 (Gossypium arboreum), or D5 (Gossypium raimondii) genomes. We conducted a comparative analysis using deep re-sequencing of multiple accessions of each diploid species and identified 24 million SNPs between the A-diploid and D-diploid genomes. These analyses facilitated the construction of a robust index of conserved SNPs between the A-genomes and D-genomes at all detected polymorphic loci. This index is widely applicable to read-mapping efforts in other diploid and allopolyploid Gossypium accessions. Further analysis also revealed locations of putative duplications and deletions in the A-genome relative to the D-genome reference sequence. The approximately 25,400 deleted regions removed more than 50% of the sequence of each of 978 genes, including many involved in starch synthesis. In the polyploid genome, we also detected 1,472 conversion events between homoeologous chromosomes, including events that overlapped 113 genes. Continued characterization of the Gossypium genomes will further enhance our ability to manipulate fiber and agronomic production of cotton.
Available at: http://works.bepress.com/jonathan_wendel/60/