| Abstract Detail
Systematics Section / ASPT
Sanderson, Michael [1].
Phylota Browser: software to support phylogenetic inference from GenBank.
Molecular sequence data for 160,000 species, including 58,000 green plants, are archived in NCBI’s GenBank, providing a rich source of data for phylogenetic inference. We have constructed a web-accessible database of a taxonomically enriched subset of the GenBank data for eukaryotes, tailored for phylogenetics users. This Phylota Browser was assembled by building clusters of locally homologous sequences from all-versus-all BLAST searches for each node in NCBI’s taxonomy tree. A data availability matrix is constructed for each node, reporting whether a given cluster-by-taxon entry has a sequence in the database. This provides a useful view of which new sequences must be obtained to complete the matrix for subsequent supermatrix or supertree construction. Smaller and more complete subsets of the data can be parsed out manually or via formal algorithms. Finally, the data can be downloaded for individual clusters or for sets of clusters for subsequent alignment and analysis. To illustrate the utility of the database, we examined 12,000 of these clusters distributed across all eukaryotes, built alignments with ClustalW and maximum parsimony bootstrap trees with PAUP, and then tallied a measure of support for each taxon in the NCBI tree using these phylogenetic results. This measure calculates the fraction of a clade’s taxa that are “well-supported”, meaning that their summed measures of support across all the clusters in which they are found exceed a specified value. The distribution of this support can be plotted on the NCBI taxonomy tree to reveal areas that have received relatively more or less attention. In green plants, in addition to model organisms, conifers and an assortment of angiosperm clades, including several within Poaceae and Asteraceae, are relatively well supported by this measure.
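The data availability matrix and the "well-supported" fraction described above can be illustrated with a minimal Python sketch. It is not the Phylota Browser's code: the function names, the input layouts (a list of taxon and cluster pairs, and a per-cluster dictionary of support values), and the use of a strict "greater than" comparison against the threshold are all assumptions made for illustration. The abstract does not say how per-taxon support is derived from the bootstrap trees, so the values here are simply treated as given numbers.

```python
def availability_matrix(records):
    """Build a taxon-by-cluster presence/absence matrix.

    records: iterable of (taxon, cluster_id) pairs, one per sequence
    (hypothetical input layout, assumed for this sketch).
    Returns (taxa, clusters, matrix) where matrix[i][j] is 1 if
    taxa[i] has at least one sequence in clusters[j], else 0.
    """
    present = set(records)
    taxa = sorted({taxon for taxon, _ in present})
    clusters = sorted({cluster for _, cluster in present})
    matrix = [[1 if (t, c) in present else 0 for c in clusters] for t in taxa]
    return taxa, clusters, matrix


def missing_entries(taxa, clusters, matrix):
    """List the (taxon, cluster) cells that still lack a sequence."""
    return [
        (t, c)
        for i, t in enumerate(taxa)
        for j, c in enumerate(clusters)
        if matrix[i][j] == 0
    ]


def well_supported_fraction(clade_taxa, support_by_cluster, threshold):
    """Fraction of a clade's taxa whose summed support exceeds threshold.

    support_by_cluster: dict mapping cluster_id -> {taxon: support value}
    (assumed layout; how support is assigned to a taxon is not specified
    in the abstract).
    """
    totals = {}
    for per_taxon in support_by_cluster.values():
        for taxon, value in per_taxon.items():
            totals[taxon] = totals.get(taxon, 0.0) + value
    if not clade_taxa:
        return 0.0
    supported = sum(1 for t in clade_taxa if totals.get(t, 0.0) > threshold)
    return supported / len(clade_taxa)


# Toy data, invented for illustration only.
records = [("Zea mays", "rbcL"), ("Zea mays", "matK"), ("Oryza sativa", "rbcL")]
taxa, clusters, matrix = availability_matrix(records)
gaps = missing_entries(taxa, clusters, matrix)

support = {
    "rbcL": {"Zea mays": 80, "Oryza sativa": 40},
    "matK": {"Zea mays": 70},
}
fraction = well_supported_fraction(["Zea mays", "Oryza sativa"], support, 100)
```

In this toy case `gaps` lists the one cell that would need a new sequence to complete the matrix, and `fraction` is the share of the two taxa whose summed support passes the chosen threshold.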
Related Links: Link to Phylota Browser
1 - University of Arizona, Ecology and Evolutionary Biology, Tucson, AZ, 85721, USA
Keywords: phylogeny; information content; GenBank.
Presentation Type: Oral Paper:Papers for Sections
Session: CP24
Location: Continental C/Hilton
Date: Tuesday, July 10th, 2007
Time: 9:30 AM
Number: CP24007
Abstract ID: 1481 |