# DNA computers hold promise of massive parallelism

ORLANDO, Fla. — Researchers will gather here beginning July 13 to advance the state of DNA computing, a field that holds the promise of ultra-dense systems that pack megabytes of information into devices the size of a silicon transistor. While some researchers suggest DNA computing is an infant discipline looking to find its way into real-world applications, papers presented here at the Genetic and Evolutionary Computation Conference (GECC) will describe new techniques and tools for building and using DNA computers.

Today anybody can order any imaginable sequence of DNA "instructions" by mail order for $1 per instruction. What you get is a test tube with about a trillion copies of any specific DNA sequence you desire. Worldwide, about a dozen laboratories thus far have ordered experimental DNA sequences in an attempt to derive living computers.

The allure of DNA computing is the far denser packing of molecular information compared with silicon-based computers. A single bacterium cell measures just a micron square — about the same size as a single silicon transistor — but holds more than a megabyte of DNA memory and has all the computational structures to sense and respond to its environment.

The father of DNA computing, University of Southern California professor Leonard Adleman, spawned the field in 1994 with his seminal paper, "Molecular Computation of Solutions of Combinatorial Problems." Since then, Adleman has demonstrated how the massive parallelism of a trillion DNA strands can simultaneously attack different aspects of a computation to crack even the toughest combinatorial problems, such as the government's supposedly uncrackable Data Encryption Standard (DES).

"We have speculated that many problems could be solved by DNA computing, but so far we have only experimentally solved a few 'toy' problems — that is, problems that can be solved with pencil and paper," Adleman said. "The next step, which I believe will be achieved within the next year, is to experimentally demonstrate the solution with a DNA computer of a problem that can already be solved by conventional computers but that cannot be solved with pencil and paper."

The main thrust of current DNA computer enthusiasts is to identify the most promising architectural features needed to perform computations with DNA. The search is reminiscent of the days before the von Neumann computer, when people knew that vacuum tubes could perform computations but had not yet identified the separation between data and instructions that would become the hallmark architectural feature of digital computing.

"About 15 groups are actively doing DNA computer research worldwide, and most of these groups are looking for the right architectural features for doing such molecular computations. DNA is just one molecule with which these computers could eventually be made," said Adleman.

Just as vacuum tubes, rack panels and all the mechanical parts necessary to build the original computers were available to pre-von Neumann experimenters, today you can buy DNA and enzymes and all the machinery necessary to manipulate individual molecules. It's what Adleman calls the toolshed of molecular science.

"Some people are going to go into the toolshed of molecular science and cobble together a potholder that only their mother could be proud of, but others are going to build things of exquisite beauty and grace," Adleman said.

The first toy problems solved by DNA computations were Hamiltonian path problems, often called traveling-salesman problems. The objective is to find the optimal path by which to visit a fixed number of cities once each. The problem can be solved with pencil and paper if only a small number of cities are involved, but it explodes into a non-deterministic time problem (NP) when large numbers of cities are considered. On conventional computers, NP problems quickly become intractable because of the large number of possible paths that must be tested and compared.

But DNA computers can use their massive parallelism to find the optimal route among a large number of cities without trying out every possible combination one at a time. Instead, massive numbers of short DNA sequences representing each city are mixed together in solution. Each end of each city sequence is sticky, so that they become stuck together in long sequences representing every possible order in which cities could be visited.

Every possible route through the cities is generated at one time, usually in less than an hour in a test tube. The next task is to filter out the DNA sequences that start and end with the city of origin. Then the sequences with the correct number of stops — one per city — are filtered out. Finally, the sequences that visit each city only once are filtered out, yielding a set of optimal solutions.

Adleman performed agarose gel electrophoresis, ligation reactions and polymerase chain reactions to carry out those steps with real DNA sequences. He derived an optimal solution to a seven-city traveling-salesman problem in approximately one week. Unfortunately, you can solve the same problem on a piece of paper in about an hour — or by a digital computer in a few seconds.

But when the number of cities is increased to just 70, the problem becomes intractable for even a 1,000-Mips supercomputer. By contrast, the 70-city problem is a theoretical breeze for DNA computing, because while a single DNA molecule performs at only .001 Mips, a test tube full can perform at about 1 quadrillion Mips.

That scalability represents the promise that has attracted researchers to search for the best DNA computer architecture.

The ability to derive fast solutions to NP problems is theoretically applicable to a host of real-word applications. For instance, the routing problem by which telephone calls are switched from their source to their destination could be optimized.

Indeed, GECC workshop chairman David Harlan Wood, a University of Delaware professor, said DNA computing is a solution searching for a problem. "The field is still groping around for the most natural kind of problem for this kind of computation. Three problem areas have been tried, but there is no consensus yet as to which is best," said Wood.

Wood and his Delaware team are using DNA computing to attack a set of problems known as Max 1's. Here the "bits" of a DNA strand are encoded to represent the solution to the problem, the simplest of which would be all 1's.

"These problems can be performed in simulation on digital computers, but our group was the first to propose and carry out this kind of computation in the lab with real DNA strings," said Wood.

At the conference, Wood's group will present a paper summarizing its findings on implementing the Max 1's problem with real DNA.

Another session paper will describe a virtual DNA simulator that uses genetic algorithms to design the protocols by which real DNA computers can be implemented with off-the-shelf lab equipment. And the latest findings in solving the traveling-salesmen problem will be described in a paper on edge detection and recombination.

DNA computing is also being applied to a third set of problems, known as tiling problems, by California Institute of Technology (CalTech) post-doctoral researcher Eric Winfree. Tiling problems are similar to, but more complex than, the familiar map-coloring problem, which asks whether a finite set of colors is sufficient to color every country on a map without any two adjacent countries' having the same color.

Winfree worked with neural-network pioneer John Hopfield at CalTech early in his career but has since made a name for himself by pioneering the application of DNA computing to tiling problems.

"What I build are rectangular DNA molecules that have four sticky sides that can only stick to certain others. For instance, if one molecule has AACC on one side, then it will only stick to other DNA molecules with a complementary side — here, GGTT."

Winfree depends on the spontaneous self-assembly of large numbers of tiling molecules into what he calls desirable objects. "Self-assembly is capable of the same computations as a digital computer; you just have to set up the conditions correctly," he said.

The basic idea is to code a satisfiability problem into a set of tiles that self-assemble to make a random input. Another set of tiles is then released that evaluates the input to determine whether it solves the problem. Putting trillions of such self-assembling molecules into a liquid eventually allows the correct solution to be assembled and filtered out.

"Probabilistically, you can make sure you get an optimal solution by the sheer massive number of DNA molecules you use. The best problems for this approach are ones where you can verify the answer when you get it — what we call NP-complete problems. These are very hard to solve but are easy to verify once you have a solution," said Winfree.

All the "normal" types of knowledge that people routinely memorize, such as multiplication tables, can be easily encoded into self-assembling sets of DNA molecules, Winfree said. Each entry in such a liquid multiplication table would be encoded onto a DNA molecule. By making trillions of copies of each molecule for each entry into the table, one could ensure that the whole multiplication table would be available to every "test tube" computation.

"I see DNA computing to be as universal as mathematics itself," said Winfree. "It contains extremely beautiful ideas, just like mathematics, but it is not immediately obvious just where the applications of any part of it is put to best use — just like mathematics."