The power of interconnected computers has taken molecular modeling from the Pony Express to the cloud
It was only 50 years ago, but it could have been 500. In the 1960s, academic computational chemists shared their computer programs via a Pony Express-type service run by scientists at Indiana University, Bloomington, called the Quantum Chemistry Program Exchange.
Members learned about new software in circulated newsletters, and for a small fee, they could order the programs’ source codes, which were sent by mail on computer punch cards or magnetic tape.
Doing the actual computational science was just as ponderous, recalls Henry Rzepa, a chemistry professor at Imperial College London. As a graduate student at the University of Texas, Austin, in the 1970s, Rzepa spent days in a dedicated computation center, wrestling with punch cards. “It was a lot of tedious, repetitious work, punctuated by the occasional discovery,” he says.
Then came the 1980s. The growth of the Internet swiftly made such slow, tortured communication and scientific progress a distant memory. The seemingly simple act of connecting computers to one another completely transformed the computational landscape, eventually leading to today’s ability to perform molecular calculations on demand, with almost limitless computing power.
Thirty years ago, though, few laypeople had e-mail, let alone dial-up modems. But that didn’t stop academic institutions from anticipating the massive scientific paradigm shift that was about to occur.
In 1985, for example, a consortium of Dutch chemists formed the Dutch National Facility for Computer Assisted Organic Synthesis & Computer Assisted Molecular Modelling. The center developed ways to link together computers at distant facilities. Attendees of a 1987 conference in the Netherlands titled “Chemical Structures: The International Language of Chemistry” reported the design of a user-friendly graphics menu interface that allowed “even the novice user direct choice.”
But it was the World Wide Web that really opened up the floodgates to progress in computational chemistry, Rzepa says. In 1994, Rzepa and his colleagues published a prescient paper in Chemical Communications, “Chemical Applications of the World-Wide-Web System” (DOI: 10.1039/c39940001907).
Suddenly, chemists could turn the scads of numbers—bond angles, dipole moments, and the like—they’d been using to represent molecules into two- and three-dimensional pictures. Rzepa credits in particular the open-source molecular structure viewing program Jmol for harnessing the power of the Web, “showing how you could take computational chemistry software and a Web browser and convert it into rotating pictures.”
The World Wide Web also allowed scientists to harness the power of the personal computers that had entered homes en masse since the 1980s. Most computers sit idle the majority of the time, their processors unused. Scientists realized that these computers could instead perform small tasks during their downtime and send the results to a central computing center. The collected results could then be used to solve big problems. This strategy is now known as distributed computing.
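The idea can be sketched in a few lines of Python. This is only an illustrative toy, not the design of any real project: the function names, the sum-of-squares task, and the use of local threads to stand in for volunteers' home computers are all assumptions made for the example. A real system would ship work units over the network and verify the returned results.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_work_units(items, unit_size):
    """Cut one large job into small, independent work units."""
    return [items[i:i + unit_size] for i in range(0, len(items), unit_size)]


def process_work_unit(unit):
    """Hypothetical small task a volunteer machine runs while idle.

    Summing squares is a placeholder for a real scientific calculation.
    """
    return sum(x * x for x in unit)


def run_distributed(items, unit_size=4, volunteers=3):
    """Hand out work units, collect partial results, and combine them."""
    units = split_into_work_units(items, unit_size)
    # Assumption: threads stand in for volunteer computers. A real
    # project would send each unit to a remote machine instead.
    with ThreadPoolExecutor(max_workers=volunteers) as pool:
        partial_results = list(pool.map(process_work_unit, units))
    # The "central computing center" combines the small answers.
    return sum(partial_results)


if __name__ == "__main__":
    print(run_distributed(list(range(10))))
```

The approach works because each work unit can be computed without knowing anything about the others, so the units can be finished in any order on any machine.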
In 1999, scientists at the University of California, Berkeley, famously launched SETI@home, in which people volunteered to use their home computers to analyze radio telescope data for signs of intelligent life elsewhere in the universe.
Around that time, Vijay Pande was just starting his career as an assistant chemistry professor at Stanford University. “I wanted to do something big,” he says. “The limiting factor in computation was the paucity of computer power.”
Pande recognized the potential for distributed computing to solve complicated computational problems in chemistry. He developed methods to break up large calculations into many small ones to predict how a protein folds. His lab launched Folding@home in October 2000.
Fifteen years later, Folding@home is still going strong, with more than 140,000 participants. It has been joined by numerous other distributed computing projects such as Rosetta@home, which predicts protein structures, and climateprediction.net, which models climate change (C&EN, April 2, 2007, page 62).
Meanwhile, in the 1990s, academicians and governments began connecting large, geographically distant computer clusters, creating so-called grids. Grids could be used by many different groups and gave scientists unprecedented computing power without having to build their own supercomputing facilities.
These grids’ more commercial cousin, what is now called “the cloud,” also makes use of large systems of linked computers. Unlike systems of linked supercomputers, which require time sharing, the cloud is a tremendously flexible resource, providing as much on-demand computing power as is needed, for as long as it’s needed. Largely run by companies such as Amazon or Google, the cloud requires even less technological commitment on the part of a scientist. Pharmaceutical companies have embraced the cloud, purchasing cloud computing time to search drug databases or to perform docking calculations on compound libraries.
Today, most scientists, even academic researchers, agree the future of computational chemistry lies largely in the cloud.
Paul Davie, who manages the Cambridge Crystallographic Data Centre’s site at Rutgers University, sees access to the cloud as a “game changer” for smaller biotech companies. In the cloud, these companies have a wealth of computing resources at their fingertips without having to invest in a large computer. “It’s like renting a good hotel room instead of buying a house,” he says.
Initially, Davie says, pharmaceutical and biotech companies balked at the idea of the cloud, in part because of security concerns. Then, they realized that companies such as Amazon have invested a tremendous amount in security. “Their reputation depends on security resources,” Davie says. “I think that’s been accepted.”
Of course, the cloud can’t solve every chemical problem. Some types of problems, such as lengthy molecular dynamics simulations, will always require frequent communication among the speedy processors of supercomputers.
Still, academic chemists, Pande says, are also realizing the benefits of the cloud’s instant availability for short-term projects. “Universities don’t build their own phone systems,” Pande observes, so “there’s no reason to put together their own [computer] clusters—especially when companies are doing it at extremely low cost.”