David Eisenberg:
Probably nowhere is the explosion of information more inspiring -- or more daunting -- than in the field of molecular biology. A decade and a half ago, just characterizing the molecular basis of a single gene could take a researcher years. Today, biologists are systematically churning out the sequences of every gene of entire organisms. The first came in 1995 -- a bacterium known as Haemophilus influenzae. Now there are more than two dozen completed genomes, a number that will grow to 100 by the end of next year. The complete human genome -- the sequences of 100,000 or so genes -- will follow shortly thereafter.

But what to do with all this information? Each of these genes codes for a single protein, and proteins constitute the molecular machines of our bodies and the components of most cellular structures. As David Eisenberg, director of the UCLA-DOE Laboratory of Structural Biology and Molecular Medicine, notes, each protein has a specific function, a function that is established not so much by the string of amino acids in the protein as by the specific three-dimensional shape (its "conformation") that it takes on after it's created.

"Our bodies contain about 100,000 different kinds of proteins," explains Eisenberg. "Our goal in this laboratory is to learn for as many of these as possible their three-dimensional structure and their function -- what do they do in the body?"

The traditional method for understanding protein function was remarkably time-consuming. Researchers would separate a single protein from the cells in which it lives and then grow a crystal of that protein. Then, using X-rays, the researchers would create an enlarged picture of the protein. From the picture, they would try to figure out the structure and then the function. Eisenberg's first protein -- an enzyme called glutamine synthetase -- took him 20 years to characterize.
But now researchers have this avalanche of genomes -- each providing the sequences of thousands of genes, from which it is relatively easy to establish the sequences of the thousands of matching proteins. The problem for Eisenberg and his collaborators, Todd Yeates and Chris Lee in the Department of Chemistry and Biochemistry, is figuring out the function of all these proteins. To solve the problem, the three researchers -- working specifically on the proteins found in yeast, whose fully sequenced genome encodes about 6,200 proteins -- created two new methods to learn protein functions: the "phylogenetic-profile" and the "Rosetta-stone" methods.

The phylogenetic-profile method requires looking for pairs of proteins that are always found together in organisms: wherever one of the pair is present, so is the other, and wherever one is missing, both are. "They always move together from organism to organism during evolution," says Eisenberg. "And proteins that always move together, function together. If we know the function of one, then we have a clue as to the function of the other as well. So we download information on genomic sequences from those Web sites where sequences are posted, and then we analyze which proteins are in every organism, looking for these pairs of proteins that always move together."
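The pairing idea Eisenberg describes can be illustrated with a small sketch. This is not the laboratory's actual software: the protein names, genome names, and presence data below are made up for illustration, and deciding whether a protein is "present" in a genome would in practice rely on sequence comparison rather than a hand-written table. The sketch simply records, for each protein, a presence/absence pattern across genomes and groups together proteins whose patterns are identical.

```python
from collections import defaultdict

# Hypothetical toy data: the genomes in which each (made-up) protein occurs.
GENOMES = ["genome_A", "genome_B", "genome_C", "genome_D"]
PRESENCE = {
    "P1": {"genome_A", "genome_C"},
    "P2": {"genome_A", "genome_C"},
    "P3": {"genome_B", "genome_D"},
    "P4": {"genome_A", "genome_B", "genome_C"},
    "P5": {"genome_B", "genome_D"},
}


def phylogenetic_profile(present_in, genomes):
    """Return a tuple of 0/1 flags, one per genome, marking presence."""
    return tuple(int(genome in present_in) for genome in genomes)


def co_occurring_groups(presence, genomes):
    """Group proteins whose presence/absence pattern is identical."""
    by_profile = defaultdict(list)
    for protein, present_in in presence.items():
        profile = phylogenetic_profile(present_in, genomes)
        by_profile[profile].append(protein)
    # Only a group with two or more members hints at a shared function.
    return [sorted(group) for group in by_profile.values() if len(group) > 1]


groups = co_occurring_groups(PRESENCE, GENOMES)
print(groups)
```

In this toy data, P1 and P2 always appear together, as do P3 and P5, so if the function of P1 were known it would offer a clue to P2. Requiring identical patterns is the simplest possible rule; a real analysis would probably have to tolerate small differences between profiles.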
The Rosetta-stone method also requires looking for protein pairs. However, in this case, the researchers look for proteins that are distinct in one organism but fused together in another. For instance, two proteins in the fly genome might be found as a single extra-long protein in the worm genome. "If they're fused together in the second organism, it suggests to us that they function together in the first," explains Eisenberg. "We call those fused-together proteins 'Rosetta-stone sequences' because they link together the two distinct proteins from the first organism."

Using such methods, Eisenberg and his colleagues have already figured out the likely functions for over half of the 2,500 yeast proteins for which no function was known. Rosetta-stone sequences are "astoundingly common," as Eisenberg puts it. In fact, the team has found some 50,000 Rosetta-stone sequences in other organisms, each of which is composed of two fused sequences from the 6,200 proteins of the yeast genome. "That suggests that each yeast protein interacts with on the order of 10 other proteins in the cell," says Eisenberg. The researchers hypothesize that the proteins start off as one big protein, then split into two or more as the organisms evolve.

"We're eager to march through these genomes and figure out what proteins are doing and which proteins interact with which other proteins," says Eisenberg. "It is a way of going from studying one protein at a time to studying whole networks of interacting proteins. This in turn will lead us to studies of how these networks of proteins function in cells."

Learning to read such data has led to surprising discoveries. Among those is one that may have far-reaching consequences for humankind. Eisenberg and his colleagues discovered that a family of proteins found in yeast might have human versions -- known as "homologues" -- that are involved in colorectal cancer, one of the leading causes of cancer deaths in the United States.
"We've shown that this new family of proteins is linked to two other families of proteins known to have human homologues involved in colorectal cancer," says Eisenberg. "Our hypothesis is that this new family is involved also."

If that turns out to be the case, it's conceivable that researchers could find ways to inhibit the action of these proteins to block the biochemical pathway leading to cancer. Such is the value of finding knowledge in information. As Eisenberg notes: "That goes far beyond our research. But we're hoping that the application of our methods to humans, when the human genome is available, will give many such valuable insights." As well as second lives.
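The fused-protein search described earlier in the article can also be sketched in a few lines. This is an illustration under loud assumptions, not the team's method as implemented: the sequences and names are invented, and exact substring matching stands in for the sequence-similarity searches a real analysis would use. The sketch reports every pair of separate proteins from one organism that both appear, without overlapping, inside a single longer protein from another organism.

```python
from itertools import combinations

# Hypothetical toy sequences (made up). Exact substring matching is a stand-in
# for real sequence-similarity searches.
ORGANISM_1 = {"A": "MKTAYIAK", "B": "GSHMLEDP", "C": "QQRSTVWY"}
ORGANISM_2 = {"X": "MKTAYIAKGGGSHMLEDP", "Y": "QQRSTVWY"}


def rosetta_links(separate, candidates):
    """Find pairs of separate proteins that occur fused in one candidate."""
    links = []
    for (name1, seq1), (name2, seq2) in combinations(sorted(separate.items()), 2):
        for fused_name, fused_seq in sorted(candidates.items()):
            start1 = fused_seq.find(seq1)
            start2 = fused_seq.find(seq2)
            if start1 == -1 or start2 == -1:
                continue  # the candidate does not contain both proteins
            # The two matches must occupy different stretches of the candidate.
            no_overlap = (start1 + len(seq1) <= start2
                          or start2 + len(seq2) <= start1)
            if no_overlap:
                links.append((name1, name2, fused_name))
    return links


links = rosetta_links(ORGANISM_1, ORGANISM_2)
print(links)
```

Here protein X plays the role of the Rosetta-stone sequence linking A and B, while C has no fused partner and yields no link. As a rough check on the figures quoted in the article, 50,000 links spread over 6,200 yeast proteins comes to about 8 links per protein, or about 16 if each link is counted once for each of its two partners, which fits Eisenberg's "on the order of 10."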
"We're eager to march through these genomes and figure out what proteins are doing and which proteins interact with other proteins."