PALO ALTO, Calif. -- Biotechnology companies are swamped by a rising tide of data on genes and proteins that could influence human illnesses, with all too little information about how those interactions work and how to harness them.
Dealing with the immediate challenge requires a shift to what researchers are calling systems biology, an emerging field of math-based predictive biology.
Underscoring the problem, one project at Bayer Biotechnology spanned more than three years, tested thousands of proteins and found some 38 target genes or proteins worth investigating. But it failed to generate a single lead for a clinical trial.
"I'm up to my eyes in targets," said Ken Kupfer, head of scientific informatics at Bayer's office in Berkeley, Calif. "The time it takes us to develop the depth of understanding we need blows our business models."
The problem is expanding as researchers begin to study the proteins that genes encode. "We are in severe danger of swamping the systems. The anaconda has just started swallowing the antelope of genomics, and behind it is coming the elephant of proteomics," said Jeremy Levin, head of strategic alliances for Novartis Institutes for BioMedical Research Inc. (Cambridge, Mass.).
"The microarray chips are generating all this data, and the pharmaceutical companies don't know what to do with it all," said Richard Popp, director of ethics and policy for the biodesign program at Stanford University. Popp was referring to a growing class of lab-on-a-chip devices that analyze genes and proteins.
Even so, the microarray chips used to analyze biological samples today "only read a small part of the problem. It's like sweeping the ocean with a wide-mesh net to determine the ecology of the bottom of the sea," said Jeff Augen, chief executive officer of TurboWorx Inc. (Shelton, Conn.), one of several vendors of software tools at BioSilico.
Kupfer of Bayer said that some good tools are emerging, but storage remains the most painful bottleneck. "Right now we can extract something from the raw data," said Kupfer. "The question is, how confident are we that we won't have to go back to the raw data? That's what's blowing the budget."
A spokesman at CuraGen Corp. (New Haven, Conn.) agreed. The company is developing diagnostic chips, DNA sequencers and an online clinical database that now resides on a 20-Tbyte storage-area network growing by 10 Tbytes a year. "It's getting very large very fast," he said.
Biotech companies are feeling the commercial pressures. As a group, the estimated 220 public biotech firms reported losses totaling about $12.5 billion last year, said John Vitalie, a regional vice president of Nasdaq, where many of the companies are listed. He estimated the group as a whole could become profitable by 2010.
"I for one would not currently invest in the current bioinformatics model [which combines biology and computer science] as it exists today, but I am hopeful that in five to 10 years there will be new models," said Nandini Tandon, a partner in life sciences venture financing at RBC Capital Partners (San Francisco).
Adding to the troubles, biotech companies as a group are expected to reduce R&D spending by about 18 percent this year, though the return on R&D investment should improve, said Vitalie of Nasdaq. Nevertheless, others point out that the R&D reduction comes as the costs of clinical trials are rising significantly.
Clusters hold sway
Robert Bishop, CEO of technical-computer maker SGI (Mountain View, Calif.), said the biotech industry faces a mismatch in its growing technical hurdles and shrinking R&D budgets. "That's driving everyone down to PCs and clusters. The result is a lowest-common-denominator approach, and the biggest problems don't get solved," he said.
Nevertheless, the U.S. Department of Energy is putting federal research dollars into computational biology, Bishop noted. And the Department of Homeland Security is packing the signatures of chemical and biological agents it wants to quickly identify into a 100-Tbyte shared-memory system.
"You'll see these large shared-memory systems show up in more and more places in biology," Bishop said.
Still, most observers said, large multiprocessing systems like those from SGI are losing ground to lower-cost clusters of commodity PCs based on the Linux operating system.
"That's probably 70 to 80 percent of what we are working with," said Joseph Donahue, president of genomic software tool vendor Lion Bioscience Inc. (Cambridge, Mass.).
"People don't want to build custom systems anymore," said Bud Tribble, vice president of software technology at Apple Computer Inc., which is renewing its focus on life sciences with its 64-bit G5 desktop systems.
Virginia Tech is constructing a cluster of 1,100 G5s. The machine is expected to be named the world's fourth most powerful supercomputer in rankings to be released this month.
"Most of these [biotech] problems don't scale well by taking them to a large SMP [symmetric-multiprocessing system]; they scale better by leveraging their embarrassingly parallel nature with clusters," said Augen of TurboWorx.
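The scaling argument Augen makes can be sketched in a few lines: an "embarrassingly parallel" workload is one where each unit of work (say, scoring one biological sample) is independent of every other, so it spreads across cheap cluster nodes without the shared memory an SMP provides. A minimal Python sketch, using local worker processes as a stand-in for cluster nodes; the `score_sample` function is a hypothetical placeholder for a real per-sample computation:

```python
# Sketch of an embarrassingly parallel workload: each sample is scored
# independently, so tasks need no shared state and can be farmed out to
# separate processes (or, on a real cluster, separate machines).
from multiprocessing import Pool

def score_sample(sample):
    """Hypothetical stand-in for an independent per-sample computation."""
    return sum(x * x for x in sample)

if __name__ == "__main__":
    samples = [[i, i + 1, i + 2] for i in range(8)]
    with Pool(processes=4) as pool:
        # One task per sample; no communication between workers.
        scores = pool.map(score_sample, samples)
    print(scores)
```

Because the tasks never communicate, adding nodes scales throughput almost linearly, which is why commodity Linux clusters fit these problems better than a single large shared-memory machine.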
Despite the technical and financial hurdles, the biotech field still holds plenty of promise for those with patience as the basic science unfolds.
"I believe biologists working in tandem with specialists in other fields such as math, computer science and engineering will make some of the most important scientific findings in history," said James H. Clark, founder of Silicon Graphics Inc. and Netscape Communications Corp., at the recent dedication of Stanford University's bioengineering center, which bears his name. Clark contributed $90 million to the center, which houses biologists, chemists, computer scientists and engineers.
MIT's heady offering
Three thousand miles away, a similar initiative at MIT is preparing to offer one of the first doctorate-level programs in computational biology. "MIT is currently the only place I know of that plans to offer a PhD program in this field," said Brigitta Tadmor, who co-chairs MIT's initiative, which includes 18 professors from its electrical-engineering and computer science department. "We plan to launch it officially in the fall of 2004."
Similar bioengineering departments are coming together at Cornell, Harvard, Princeton and many other universities to push forward the convergence of math, biology, computers and engineering.
"Biology is really about digital information, and systems biology is going to tell us how information operates in biological systems," said Leroy Hood, president of the Institute for Systems Biology (Seattle), speaking at the BioSilico conference. "We want to develop mathematical models that would predict the behavior of biological systems. To do this work you have to have leading-edge technologies and tools."
Hood, who developed an early DNA analyzer that enabled the human genome sequencing project, is now collaborating on a nanolab-on-a-chip that could use nanosensing wires and microfluidics to carry out several hundred or even a few thousand tests on a group of 100 cells.
With the advent of such tools, Hood predicts that over the next two decades, advances in systems biology will expand the human life span by 10 to 30 years. "That will pose significant social, moral and ethical challenges for medicine, insurance, education and other fields," he said.
In this still-young phase, biotech needs to be science-driven, focused on making and testing fresh theories rather than on using computers to catalog data, the focus of many current bioinformatics companies, said Martin Gerstel, chairman of biotech firm Compugen Ltd. (Tel Aviv, Israel).
"There will be no need for bioinformatics in five years, but there is a great need for theoretical biologists," Gerstel said. Unfortunately, "there is almost no money being put into building and validating predictive models in life science. The financial people just don't want to hear about it."