Mapping protein evolution one fold at a time

Author: Elizabeth Ng
Date: October 2007

In a study published this month in Genome Research, researchers at the University of Illinois documented the evolution of protein structure. Proteins are found in all living organisms and carry out some of the most important biological reactions, so understanding protein structure in an evolutionary context can be very powerful.

One application of these data is the analysis of metabolic pathways, a problem that the same team tackled in a separate study published this month in the Proceedings of the National Academy of Sciences.

"We are interested in how structure evolves, not how organisms evolve," said Caetano-Anollés, "We are using the techniques of phylogenetic analysis that systematicists used to build the tree of life, and we are applying it to a biochemical problem, a systems biology problem."

The team compiled data on protein structure from two databases, the Structural Classification of Proteins (SCOP) and the Kyoto Encyclopedia of Genes and Genomes (KEGG), and combined them with phylogenies, or family trees of proteins and their functions.

The researchers then categorized the various elements of each protein structure, called folds, based on similarity of structure and known enzymatic function. Next, they classified the folds by age, assuming that the most prevalent folds would also be the oldest.
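To make the prevalence assumption concrete, here is a minimal sketch in Python. It is not the researchers' phylogenetic method. It only illustrates the idea that a fold found in more genomes is treated as older. The function name, the genome names, and the fold names are all made up for illustration.

```python
from collections import Counter


def rank_folds_by_prevalence(genome_folds):
    """Order fold names from most to least widespread.

    genome_folds maps a genome name to the folds observed in it.
    Under the prevalence assumption, earlier in the result = older.
    """
    counts = Counter()
    for folds in genome_folds.values():
        # Count each fold at most once per genome.
        counts.update(set(folds))
    # Sort by descending genome count, then by name for a stable order.
    return sorted(counts, key=lambda fold: (-counts[fold], fold))


# Hypothetical toy data, not taken from the study.
toy_genomes = {
    "genome_A": ["fold_1", "fold_2", "fold_3"],
    "genome_B": ["fold_1", "fold_2"],
    "genome_C": ["fold_1"],
}

print(rank_folds_by_prevalence(toy_genomes))
# prints ['fold_1', 'fold_2', 'fold_3']
```

In this toy example, fold_1 appears in all three genomes and would be ranked as the oldest.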

When these classifications (age, fold structure, function, and the existing phylogenies) were combined to create a new "global" family tree of protein structure, the researchers found that many of the older folds were functioning in conjunction with much newer structures, indicating that older structures were recruited and altered to produce new functions.

Focusing on the oldest of the folds, the researchers found that of 776 metabolic protein folds surveyed, 16 were omnipresent, and nine of those occurred in the earliest branches of the newly constructed tree.

"These nine ancient folds represent architectures of fundamental importance undisputedly encoded in a genetic core that can be traced back to the universal ancestor of the three superkingdoms of life," the authors wrote.

Although these three superkingdoms (archaea, bacteria, and eukaryota) share several fundamental folds, it was by losing various fold superfamilies that the superkingdoms became differentiated.

Interestingly, the analysis found that these oldest protein folds were most closely related to RNA metabolism. This finding supports the theory that RNA molecules were among the first biological catalysts.

Written by Elizabeth Ng

Reviewed by Matthew Getz

Published by Pooja Ghatalia.