Thursday, May 1, 2008

Folding@Home: Johnny Molecular Biologist!

This afternoon I was having an IM conversation with someone about the Sony PlayStation 3 - and after extolling a few of the system's virtues, I off-handedly remarked that the real reason I bought it was that I like to cure cancer... which, of course, got the ??? in the IM window.

In a nutshell, here's the deal - if you're 900 years old like I am, you'll remember when SETI@Home was all the rage. (Ok, I'm being a dick. With 3 million users, SETI@Home is still all the rage.) In the mid-'90s, a group of plucky astronomers at Berkeley realized that the funding for the computing network they needed to complete the massively parallel channel search for signals from extraterrestrials wasn't going to come through, no matter how many Carl Sagan-inspired movies starring Jodie Foster appeared... however, all those newfangled 386s on those brand new cable modems were just sitting there downloading porn. Surely, there must be some better way of utilizing all of those spare cycles out in the world that didn't actually annoy the wife.

Installed as a screen saver on Windows boxes, SETI@Home was one of the first true distributed computing applications that made use of the real power of the internet: in a very poetic turn, all of the people who were making use of the internet (which was created out of the ashes of ARPANET, Usenet and EDUnet) to buy books from Amazon and pretend to be Brad Pitt in chat rooms could now give back to the world by donating spare cycles on their computers in a massive cooperative effort to search for intelligent life beyond the earth.

Flash forward to 2000, and Prof. Vijay Pande at Stanford was having the same trouble: how to fund, on a limited budget, a computing effort that consumed massive quantities of CPU cycles to crack the simulation of the kinetics and thermodynamics of proteins and nucleic acids? The problem was this: Pande, and other molecular biologists, knew that the sequences of atoms required to construct a protein, if placed in the wrong order, could contribute to amyloid-related illnesses such as Alzheimer's Disease. If they could change the protein back to the correct order, it would go far toward helping cure the disease. But without a roadmap, molecular biologists have no idea how to change the sequencing to get a proper structure out of the protein. (It's a bit like knowing the pile of Lego blocks scattered on the floor in front of you forms a race car, but the only way you have to put the car together is to try to assemble the blocks without a pattern.)

This act of bending and twisting and playing with the individual atoms of a protein molecule (a process called "folding") is a required "next step" to being able to predict the structure of the molecule. Fundamentally, understanding the process of protein folding (how biological molecules assemble themselves into a functional state) is one of the outstanding problems of molecular biology. Unfortunately, the ways that the atoms of a protein molecule can be arranged are legion.
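To put a rough number on "legion", here is a toy Python sketch. It is not how Folding@Home models anything; it simply assumes (my assumption, purely for illustration) that a chain is made of independent units and that each unit can sit in one of three positions. The function name and both numbers are made up for the example.

```python
# Toy estimate of how fast the number of possible arrangements grows.
# ASSUMPTION: each unit of the chain can take a fixed number of positions,
# independently of its neighbours. Both numbers are illustrative only.

def count_arrangements(chain_length: int, states_per_unit: int = 3) -> int:
    """Number of distinct arrangements for a chain of independent units."""
    return states_per_unit ** chain_length


if __name__ == "__main__":
    for n in (10, 50, 100):
        # The count is an exact integer; it is printed in scientific notation.
        print(f"{n:>3} units -> {count_arrangements(n):.3e} arrangements")
```

Even with only three positions per unit, the count grows exponentially with the length of the chain, which is why this kind of search eats CPU cycles for breakfast.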

Professor Pande, taking a cue from the extraterrestrial hunters on the other side of San Francisco, created Folding@Home in 2000, which - also installed as a screen saver - used the untapped power of idle home PCs to brute-force the folding problem: literally trying every possible combination of atomic sequencing in a protein molecule to see what works.

On September 16, 2007, the Folding@Home project officially attained a performance level higher than one petaFLOPS, becoming the first computing system of any kind to hit that kind of peak performance. (Uh, a petaFLOP is 1,000 TeraFlops...and, uh, a TeraFlop is 1,000 GigaFlops...which

However, it wasn't enough - there was still far more work to do...

...enter the First Person Shooter crowd. With rapid advances in GPU (Graphics Processing Unit) and CPU (Central Processing Unit) speeds, modern gaming consoles like Microsoft's Xbox 360 and Sony's PlayStation 3 have power to burn: the sustained speed of a PS3 at full tilt is 30,191 MFlops. Pande and crew began to salivate. Most gamers only use their machines several hours a week; the rest of the time the machines remain idle or off.
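For a feel of what that PS3 number means, here is a back-of-the-envelope Python sketch. It assumes every console sustains the 30,191 MFlops figure quoted above and that the work splits perfectly across machines, which real life never quite manages. The function and constant names are made up for the example.

```python
import math

# 1 petaFLOPS = 1,000 teraFLOPS = 1,000,000 gigaFLOPS = 1,000,000,000 megaFLOPS
MFLOPS_PER_PETAFLOPS = 10**9

# Sustained PS3 speed quoted in the post (assumed identical for every console).
PS3_SUSTAINED_MFLOPS = 30_191


def consoles_needed(target_petaflops: float,
                    per_console_mflops: int = PS3_SUSTAINED_MFLOPS) -> int:
    """Smallest whole number of consoles whose combined speed meets the target."""
    return math.ceil(target_petaflops * MFLOPS_PER_PETAFLOPS / per_console_mflops)


if __name__ == "__main__":
    print(consoles_needed(1.0))  # consoles needed for one petaFLOPS
```

Under those assumptions, it takes a little over 33,000 consoles running flat out to add up to one petaFLOPS.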

Pande's group (Uh, really. His homies are now officially known as the Pande Group inside Stanford) approached Sony and struck a deal - the Sony Entertainment group would slap a user interface on the project worthy of the PS3 interface, the application could be downloaded from the Sony Marketplace and installed by the user (sort of a forced "opt in"), and the user was free to run the app in the background when the system would otherwise be idle. There is a cost to the user for this, of course - the GPU and CPU would always be working, which chews up a bit more of the home electric bill. In the end, however, a typical user feels good about the project, Sony looks like an altruistic uncle, and Pande gets to fold and unfold his proteins until the cows come home...which is good, because bovine spongiform encephalopathy (uh, mad cow disease) is one of the nuts that can be cracked by protein folding.

Below is a 5-minute video where I walk through the current version of Folding@Home on the Sony PS3.

Oh, by the way - the Folding@Home computing cluster, since going PS3 in 2007 and hitting 1 million users in February of 2008, currently operates above 1 petaFLOPS at all times. That's a lot of Grand Theft Auto, kids.

As promised, here are the links of interest:

Vijay Pande's Folding@Home blog
Berkeley's SETI@Home blog
Wikipedia's explanation of the protein folding problem

Image at top of posting courtesy of
Peter G. Wolynes,
Department of Chemistry & Biochemistry, University of California - San Diego

Protein Folding Explained
