


TS-Si is dedicated to the acceptance, medical treatment, and legal protection of individuals correcting the misalignment of their brains and their anatomical sex, while supporting their transition into society as hormonally reconstituted and surgically corrected citizens.
Charting Relatedness of Living Things with New Speed and Detail
SciMed - Biology
TS-Si News Service   
Saturday, 20 June 2009 15:00

Tree of Life

Austin, TX, USA. Detailed, accurate evolutionary trees that reveal the relatedness of living things can now be determined much faster and for thousands of species with a computing method developed by computer scientists and a biologist at the University of Texas at Austin.

Since Charles Darwin, biologists have constructed evolutionary trees to explain the relatedness of plants, animals and other organisms. The science of figuring out these trees, known as systematics, has progressed significantly in the last two decades largely due to advances in computation, genetics and molecular biology.

However, many of the relationships among the world's 1.5 million described species (the true number could be 10 million or more) remain to be figured out, and surprises still remain. Figuring out these relationships requires analyzing large amounts of molecular data, such as DNA and protein sequences. The research team reports their new method in the journal Science.

Bioinformatics

Bioinformatics applies information technology (IT) to molecular biology. Paulien Hogeweg coined the term in 1978.

The rapid development of genomic and other molecular research technologies has combined with IT to produce very large quantities of complex data and information.

Bioinformatics exploits this situation by developing and applying computationally intensive techniques (e.g., data mining and machine learning algorithms).

The field entails theory development, the creation and advancement of algorithms, computational and statistical techniques, and databases to solve formal and practical problems that arise when managing and analyzing biological data.

At the beginning of genomic investigations, bioinformatics was applied to creating and maintaining databases of biological information (e.g., nucleotide and amino acid sequences). Database development involved technical design issues and the development of complex new interfaces for submitting and accessing data.

Common activities include mapping and analyzing DNA and protein sequences, gene finding and genome assembly, protein structure alignment and prediction, aligning different DNA and protein sequences for comparison, creating and viewing 3-D models of protein structures, and modeling evolutionary interrelationships.

Software tools range from simple command-line access to more complex graphical programs and standalone web-services available from various bioinformatics companies or public institutions.

The tool best known among biologists may be BLAST, one of a number of generally available programs for sequence alignment. Its algorithm measures the similarity of a query sequence to other sequences drawn from curated databases of protein or DNA sequences.
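To make "similarity of sequences" concrete, here is a minimal sketch of a textbook global alignment score (the Needleman-Wunsch dynamic program). It is not BLAST's algorithm, which uses faster heuristic local alignment; the function name and the match, mismatch, and gap values are illustrative assumptions, not taken from any tool mentioned in this article.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Score of the best global alignment of a and b (Needleman-Wunsch).

    The scoring values are arbitrary illustrative defaults.
    """
    # prev[j] is the best score for aligning the processed prefix of a with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # aligning a[:i] against an empty prefix of b
        for j in range(1, len(b) + 1):
            # Either pair the two characters, or open a gap in one sequence.
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


print(global_alignment_score("GATTACA", "GATTACA"))  # 7: seven matches
print(global_alignment_score("ACGT", "ACG"))         # 1: three matches, one gap
```

Higher scores mean more similar sequences under the chosen scoring scheme; real tools use empirically derived substitution matrices rather than a single match and mismatch value.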

The US National Center for Biotechnology Information (NCBI) provides a popular web-based implementation that searches their databases.

Current initiatives, such as SATé from the University of Texas at Austin, have further extended the speed, accuracy, and detail obtained from such searches.

The evolutionary relatedness of different species or groups of species is depicted in a phylogenetic tree. Scientists arrive at this mathematical and visual depiction by analyzing genetic sequences for their similarities and differences. Progress depends on solving two interrelated problems: aligning genetic sequences and inferring their evolutionary relatedness (phylogenetic inference). The research team describes a method for using very large data sets to estimate both the sequence alignments and the phylogenetic trees.
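The step from aligned sequences to a tree can be sketched with a simple distance-based method. The example below uses UPGMA (average-linkage clustering) on made-up sequences labeled A to D; it is an illustration only and not the team's method, which, according to the paper's abstract, estimates trees with the maximum likelihood criterion. All names and data here are hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of differing sites between two aligned, equal-length sequences."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def upgma(aligned):
    """Join the closest clusters repeatedly; return the tree as nested tuples."""
    # Each cluster is a pair: (tree built so far, names of its members).
    clusters = [(name, [name]) for name in aligned]

    def avg_dist(c1, c2):
        # Average distance over all cross-cluster pairs of sequences.
        pairs = [(m, n) for m in c1[1] for n in c2[1]]
        return sum(p_distance(aligned[m], aligned[n]) for m, n in pairs) / len(pairs)

    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: avg_dist(*pair))
        clusters = [c for c in clusters if c is not c1 and c is not c2]
        clusters.append(((c1[0], c2[0]), c1[1] + c2[1]))
    return clusters[0][0]


# Hypothetical, already-aligned toy sequences.
toy = {
    "A": "ACGTACGT",
    "B": "ACGTACGA",
    "C": "ACTTACCA",
    "D": "TCTTGCCA",
}
print(upgma(toy))  # (('A', 'B'), ('C', 'D'))
```

The nested tuples group A with B and C with D, mirroring how a phylogenetic tree pairs the most similar sequences first.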

Tandy Warnow

Computer scientist Tandy Warnow, biologist Randy Linder and their graduate students created an automated computing method, called SATé, that analyzes molecular data from thousands of organisms, simultaneously figuring out how the sequences should be organized and computing their evolutionary relatedness in as little as 24 hours. Previous simultaneous methods like Warnow and Linder's have been limited to analyzing 20 species or fewer and have taken months to complete.

"SATé could completely change the practice of making evolutionary trees and revolutionize our understanding of evolution," says Warnow, professor of computer science and lead author of the study.

In addition, SATé can accurately analyze DNA sequences that are rapidly evolving. These sequences have been previously avoided due to concern that the resulting trees would be poor.

Randal (Randy) Linder

Before a tree, or phylogeny, can be determined, DNA and protein sequence data must be organized. This process is called alignment. Key to Warnow and Linder's program is its ability to quickly and accurately align these data.
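To see why the alignment step matters, consider a small sketch with two made-up sequences that differ only by a one-base shift. The function below measures the fraction of mismatched sites over columns where neither sequence has a gap; its name, the gap symbol, and the data are illustrative assumptions rather than part of SATé.

```python
def p_distance(a, b, gap="-"):
    """Fraction of mismatches over columns where neither sequence has a gap."""
    compared = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    return sum(x != y for x, y in compared) / len(compared)


# The same two hypothetical sequences, aligned two different ways.
ungapped = ("ACGTACGT", "CGTACGTA")
gapped = ("ACGTACGT-", "-CGTACGTA")

print(p_distance(*ungapped))  # 1.0: every column mismatches
print(p_distance(*gapped))    # 0.0: one gap at each end lines everything up
```

Without gaps the sequences look unrelated; with one gap at each end they look identical. Any tree built from such distances inherits whichever picture the alignment provides.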

"Our process is novel because it rapidly and simultaneously aligns sequences and looks for the best phylogenies," says Linder. Integrative biology, Linder's specialty, is primarily focused on structure and function in the evolution of diverse biological systems.

"The old way of doing this for a large number of sequences was basically to align the data once, but we can look at many arrangements to find better ones." This is important because different alignments can lead to significantly different phylogenies, and scientists must find the phylogeny that best represents the evolutionary relationships among the species in question.

For their paper, Warnow, Linder and their students tested SATé using computer-generated data and real biological data. The biological data had previously been aligned manually by other experts. The new phylogenies closely match the existing ones, validating the method's potential and, in some cases, the evolutionary trees themselves.

Michael Braun

"Warnow and Linder have created a method that speeds up the process and removes any subjectivity," says Michael Braun, an evolutionary biologist at the Smithsonian Institution not associated with this project."

This is a major step forward for evolutionary biology."

"Instead of doing things by hand, evolutionary biologists can now trust our automated program," says Warnow.

"It will enable the creation of much more accurate trees, especially for the Tree of Life, which deals with hundreds of thousands of gene sequences from the millions of species on Earth."

Participants

Computer science graduate student Kevin Liu is first author on the paper. Students Sindhu Raghavan and Serita Nelesen also contributed to the project and co-authored the paper.

Citation

Rapid and Accurate Large-Scale Coestimation of Sequence Alignments and Phylogenetic Trees. Kevin Liu, Sindhu Raghavan, Serita Nelesen, C. Randal Linder, and Tandy Warnow. Science 324(5934): 1561-1564. doi: 10.1126/science.1171243

Abstract

Inferring an accurate evolutionary tree of life requires high-quality alignments of molecular sequence data sets from large numbers of species. However, this task is often difficult, slow, and idiosyncratic, especially when the sequences are highly diverged or include high rates of insertions and deletions (collectively known as indels). We present SATé (simultaneous alignment and tree estimation), an automated method to quickly and accurately estimate both DNA alignments and trees with the maximum likelihood criterion. In our study, it improved tree and alignment accuracy compared to the best two-phase methods currently available for data sets of up to 1000 sequences, showing that coestimation can be both rapid and accurate in phylogenetic studies.

TS-Si News Service. The TS-Si News Service is a collaborative effort by TS-Si.org editors, contributors, and corresponding institutions. Sources can include the cited individuals and organizations, as well as TS-Si.org staff contributions. Articles and news reports do not necessarily convey official positions of TS-Si, its partners, or affiliates.




Last Updated on Saturday, 20 June 2009 15:10