Princeton
Weekly Bulletin
September 27, 1999
Vol. 89, No. 3

Math helps explain protein folding

Christodoulos Floudas (Photo by Pryde Brown)

By Steven Schultz

For Christodoulos Floudas, professor of chemical engineering, the watchwords of his research are "divide and conquer."

This has nothing to do with heavy-handed politics -- it's a bold mathematical technique he is applying to a question that has tantalized scientists for years: how do proteins fold?

Understanding the way proteins fold -- what three-dimensional shape they have -- is a problem that rears its head in fields from biology and chemistry to engineering and math and has major implications for medical science. The bumps and valleys in a protein's shape determine how it reacts with other proteins.

The shapes also give drug makers clues about how to design medicines that latch onto a bump or valley and interfere with the proteins involved in disease. Comprehending the way proteins fold could result in new medicines, new understanding of disease and insights into the basic mechanisms of life.

Fundamental elements of life

Proteins are fundamental elements of life. They send the messages, transport the chemicals and build the structures that allow living things to exist. Scientists have long known the precise chemical makeup of proteins. But that's only half the story. What really determines the function of a particular protein is the precise and intricate way it is folded.

28-residue segment of a zinc finger DNA-binding protein

58-residue bovine pancreatic trypsin inhibitor

When cells manufacture proteins, the molecules start off as long strings of chemical blocks but quickly fold up into ringlets and loops like ribbons on a birthday present. What puzzles scientists is that the sequence of chemicals in a protein does not offer obvious clues about the shape that the protein assumes. For example, if one protein includes chemicals A, B and C and they bend into a right angle, another protein may have the same ABC sequence in a straight line or loop. For decades, scientists have tried to understand why a particular chemical sequence leads to a particular fold. Their ultimate goal is to predict the complete three-dimensional structure of a protein based solely on knowing the sequence of chemicals that go into it.

"It is a fascinating problem, because nature does it so quickly and efficiently, and we all fail to predict it," says Floudas, who is one of a growing number of scientists trying to tackle the problem using mathematics, rather than experimental methods.

The mathematical approach is part of a field of applied mathematics called optimization. This is a process of looking at complex situations affected by many variables and understanding what values for the variables will produce the desired result. For example, transportation engineers could use optimization to create the most efficient traffic flow on a set of city streets. With proteins, the variables are the positions of atoms of the chemical units.

The chemical units in proteins, the As, Bs and Cs, are amino acids; there are 20 amino acids that can be used in any combination. A typical protein may be made of hundreds of amino acids, but for the purposes of studying protein folding, scientists often look at relatively short segments, called peptides. Even a pentapeptide, a peptide with just five amino acids, could fold into any of 100 billion possible structures, says Floudas. "How can you develop accurate mathematical models that capture all the interactions?" he asks.

How much energy is needed

Like many scientists, Floudas thinks about protein folding in terms of how much energy is needed to hold a protein together in a particular shape. A folded protein is analogous to a person sitting cross-legged on the floor, compared to the person standing bent over at the waist; one position requires less energy than the other. In theory, the same protein could assume many different stable, low-energy shapes. In nature, however, proteins always head down the one path that leads to the very lowest level of energy. Scientists want to know which path nature will select. Which is the lowest of the low? There are far too many candidate shapes to compare one at a time.

That's where the principle of "divide and conquer" comes in. Floudas has figured out a way to divide the problem in half repeatedly. For each piece he finds an upper limit on the lowest energy, given by the best conformation found so far, and a theoretical lower limit on what the protein's energy could be, a number that is at or below the actual lowest-energy state. He then uses a mathematical process that makes the two numbers converge on the correct answer. In technical language this method is called a "branch and bound global optimization approach." Floudas says it provides a "theoretical guarantee for locating the global solution to a given problem."
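
The general branch-and-bound idea can be illustrated with a toy problem. The sketch below is not Floudas' method or a protein energy model: the one-variable `energy` function, its search interval and the `LIPSCHITZ` slope limit are all assumptions chosen for illustration, and the lower limit comes from a simple slope argument rather than from the techniques used in his research. It shows only how repeated halving, an upper limit from the best point found so far and a guaranteed lower limit can be squeezed together.

```python
import heapq
import math


def energy(x):
    """Hypothetical one-variable 'energy landscape' with several wells."""
    return math.cos(3.0 * x) + 0.1 * (x - 0.5) ** 2


# Assumed limit on the slope of energy() over [-4, 4]:
# |-3*sin(3x) + 0.2*(x - 0.5)| <= 3 + 0.9 < 4.
LIPSCHITZ = 4.0


def lower_bound(a, b):
    """Guaranteed lower limit on energy() anywhere in the interval [a, b]."""
    mid = 0.5 * (a + b)
    return energy(mid) - LIPSCHITZ * 0.5 * (b - a)


def branch_and_bound(a, b, tol=1e-4):
    """Return (best_x, upper, lower) with upper - lower <= tol."""
    best_x = 0.5 * (a + b)
    upper = energy(best_x)              # energy of the best point found so far
    heap = [(lower_bound(a, b), a, b)]  # intervals ordered by lower limit
    while heap:
        lower, a, b = heapq.heappop(heap)  # smallest lower limit still alive
        if upper - lower <= tol:
            return best_x, upper, lower    # the two limits have converged
        mid = 0.5 * (a + b)
        for lo, hi in ((a, mid), (mid, b)):  # divide the interval in half
            x = 0.5 * (lo + hi)
            e = energy(x)
            if e < upper:                    # a better conformation: tighten upper
                upper, best_x = e, x
            lb = lower_bound(lo, hi)
            if lb <= upper:                  # otherwise this half cannot win
                heapq.heappush(heap, (lb, lo, hi))
    return best_x, upper, upper
```

Calling `branch_and_bound(-4.0, 4.0)` returns the best position found together with the two limits, which differ by no more than `tol`. Halves whose lower limit already exceeds the best known energy are discarded without further search, which is what spares the method from comparing every candidate one at a time.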

Floudas has had good preliminary results with a 28-amino-acid protein and is now working on one with 58 amino acids. One validation of his method came last year when a group from Columbia University used it to make good predictions about the shape of several proteins ranging in size from 54 to 183 amino acids. "There is strong evidence we're on the right track," says Floudas.

Computational conference

"Chris's work is an important new contribution to this field," says Ken Dill, professor of pharmaceutical chemistry at the University of California, San Francisco. Dill was a keynote speaker at a conference on computational approaches to protein folding, co-organized by Floudas and held at Princeton in May 1999. The conference drew more attendees than expected, says Floudas, and spurred more than the usual number of questions for such a technical gathering. "We could not finish all the questions," he notes.

According to Dill, Floudas is the first to use rigorous computing methods to achieve global optimization ("global" because it doesn't just find a good answer, it finds the best answer). However, Dill believes there is still a lot of work to be done. He likens the protein-folding problem to finding a needle in a haystack: what Floudas has done is find a very good way of sorting through all the hay; what is not yet clear is how good the description of the needle itself is. In other words, Dill asks, do computational modelers have a good enough description of how much energy is associated with a particular shape?

Despite these unresolved issues, Floudas' techniques already are finding practical applications. He is collaborating with two groups at the University of Pennsylvania that are using his optimization tools to tackle specific medical research questions, one having to do with how the immune system works and the other with the working of the protein compstatin. These collaborations combine Floudas' mathematical approach with more commonly used experimental methods.

Experimental methods

One person at Princeton using experimental methods to explore protein folding is Professor of Chemistry Jannette Carey. She is skeptical of computational methods -- in part, she says, because the computational model relies on a two-step hierarchy in protein folding.

According to this model, which is described in current textbooks, proteins first form a series of helices and straight strands, called secondary structures, and then these shapes fold back on one another to form the final shape, the tertiary structure. Carey says that her work has shown that, contrary to the textbook description, secondary and tertiary features can form at the same time, and that tertiary features often influence the secondary shapes. One result, she feels, is that the computational model, which relies on the two-step hierarchy, may be oversimplified.

Floudas acknowledges that it is too early to claim the problem has been solved, or that optimization can solve all the problems, but he believes his approach includes several technical elements that others omit, and the results are getting better and better.

"I'm encouraged," he says, "because rigorous computational methods can advance our understanding of protein structure prediction and elucidate the dynamics of protein folding."