LONDON – Processing circuitry designed to allow imprecision has been shown to be 15 times more power-efficient than conventional circuitry at some tasks. The technique is particularly applicable in audio and graphics subsystems, and such circuits could start showing up in hearing aids and tablet computers in 2013, researchers said.
A prototype processor was built by a team of researchers from Rice University in Houston, Singapore’s Nanyang Technological University (NTU), Switzerland’s Center for Electronics and Microtechnology (CSEM) and the University of California, Berkeley, and was reported at the ACM International Conference on Computing Frontiers in Cagliari, Italy, where it won best paper.
The concept, which has echoes of fuzzy logic and probabilistic processing, is straightforward: allow hardware for operations such as multiplication and addition to make mistakes, but manage the probability of errors building up.
The team has used a number of techniques to deviate from conventional full-precision, absolute-accuracy circuits. These include "pruning," in which rarely used portions of a digital circuit are cut away, and "confined voltage scaling."
In initial simulations published in 2011, the team showed how "pruned" sections of conventionally designed chips could run twice as fast and be half the size and use half the energy of the originals. In the latest research the team has implemented their ideas on a prototype silicon chip.
"In the latest tests, we showed that pruning could cut energy demands 3.5 times with chips that deviated from the correct value by an average of 0.25 percent," said study co-author Avinash Lingamneni, a Rice graduate student. "When we factored in size and speed gains, these chips were 7.5 times more efficient than regular chips. Chips that got wrong answers with a larger deviation of about 8 percent were up to 15 times more efficient."
Christian Enz, who leads the CSEM arm of the collaboration, said: "Particular types of applications can tolerate quite a bit of error. For example, the human eye has a built-in mechanism for error correction. We used inexact adders to process images and found that relative errors up to 0.54 percent were almost indiscernible, and relative errors as high as 7.5 percent still produced discernible images."
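The idea of an inexact adder can be illustrated in software, though only loosely: the sketch below models imprecision by simply discarding the low-order bits of each operand (a crude stand-in for removed carry logic, not the team's actual gate-level pruning) and measures the resulting average relative error.

```python
import random

def inexact_add(a, b, dropped_bits=4):
    """Add two unsigned ints while ignoring the lowest `dropped_bits` bits.

    A software stand-in for hardware pruning: the logic that would
    propagate the low-order bits is "cut away", so their contribution
    to the sum is simply lost.
    """
    mask = ~((1 << dropped_bits) - 1)
    return (a & mask) + (b & mask)

# Compare against exact sums over random 16-bit operands.
random.seed(0)
rel_errors = []
for _ in range(10000):
    a, b = random.getrandbits(16), random.getrandbits(16)
    exact = a + b
    if exact == 0:
        continue
    approx = inexact_add(a, b)
    rel_errors.append(abs(exact - approx) / exact)

avg = sum(rel_errors) / len(rel_errors)
print(f"average relative error: {avg:.4%}")
```

Dropping more low-order bits trades larger average error for (in hardware) fewer gates and lower energy, which is the knob the researchers describe.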
Project leader Krishna Palem, who also serves as director of the Rice-NTU Institute for Sustainable and Applied Infodynamics (ISAID) said initial applications for pruning are likely to be in application-specific processors, used in hearing aids, cameras and other electronic devices.
Inexact hardware is also being considered for ISAID's I-slate educational tablet, which is designed for Indian classrooms. Pruned chips are expected to cut power requirements in half and allow the I-slate to run from small solar panels similar to those used on handheld calculators. Palem said the first I-slates and prototype hearing aids to contain pruned chips are expected by 2013.
While approximate computation may be most readily used for processing data destined for human senses, it could also be useful for certain test-and-confirm problems, where an approximate test (not even necessarily excluding all false negatives) can filter out the majority of uninteresting results. Distributed computing projects tend to work this way, and the Large Hadron Collider also uses data filtering that _might_ be amenable to approximation if the false-negative probability were sufficiently low.
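A familiar instance of this test-and-confirm pattern (an illustration, not something from the research above) is primality testing: a cheap Fermat check never rejects a true prime, so the expensive exact test only has to run on the few candidates that survive the filter.

```python
def fermat_probably_prime(n):
    """Cheap approximate test: never rejects a true prime (no false
    negatives), but lets a few composites (base-2 pseudoprimes) through."""
    if n < 2:
        return False
    if n == 2:
        return True
    return pow(2, n - 1, n) == 1

def is_prime(n):
    """Exact but slower confirmation by trial division."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

candidates = list(range(2, 2000))
survivors = [n for n in candidates if fermat_probably_prime(n)]
primes = [n for n in survivors if is_prime(n)]
print(len(candidates), len(survivors), len(primes))
```

The exact test runs on only a few hundred survivors instead of nearly two thousand candidates, yet no prime is ever missed; that asymmetry is what makes an approximate front-end safe here.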
Something like a search engine could probably use approximate computation (the result is a list sorted by estimated fitness).
It might be possible to increase the effectiveness of a safety system by allowing the use of more data and more processing, even though the processing is approximate. Likewise, a self-correcting system (e.g., a flight control system) could tolerate minor errors.
Much simulation is effectively approximate already (e.g., modelling the gravity of a group of distant objects as a single point mass). In addition, measurements are inherently approximate and incomplete, so using approximate computation might be less inaccurate than one might naively expect.
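The point-mass approximation can be sketched in a few lines; the cluster geometry and G = 1 units below are arbitrary illustration choices, not from any particular simulation code.

```python
import math
import random

def gravity_exact(masses, positions, point):
    """Sum the inverse-square pull of every body on a test point (G = 1)."""
    gx = gy = 0.0
    for m, (x, y) in zip(masses, positions):
        dx, dy = x - point[0], y - point[1]
        r2 = dx * dx + dy * dy
        r = math.sqrt(r2)
        gx += m * dx / (r2 * r)
        gy += m * dy / (r2 * r)
    return gx, gy

def gravity_point_mass(masses, positions, point):
    """Approximate the distant cluster as one body at its centre of mass."""
    M = sum(masses)
    cx = sum(m * x for m, (x, _) in zip(masses, positions)) / M
    cy = sum(m * y for m, (_, y) in zip(masses, positions)) / M
    return gravity_exact([M], [(cx, cy)], point)

# A tight cluster of 100 unit masses, far from the test point at the origin.
random.seed(1)
cluster = [(1000 + random.uniform(-1, 1), random.uniform(-1, 1))
           for _ in range(100)]
masses = [1.0] * 100
exact = gravity_exact(masses, cluster, (0.0, 0.0))
approx = gravity_point_mass(masses, cluster, (0.0, 0.0))
rel_err = abs(exact[0] - approx[0]) / abs(exact[0])
print(f"relative error in x-pull: {rel_err:.2e}")
```

Because the cluster's spread is tiny compared to its distance, the single-point approximation is accurate to many digits while costing a hundredth of the arithmetic.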
Error detection and correction are also probabilistic, and so might be able to use approximate computation (Lyric Semiconductor's ECC product?).
This reminds me of college courses that described the notion of significant digits. If I take measurements with an accuracy of 0.1, and then mash a number of data points together and apply statistical methods, my answer will never be more accurate than 0.1. Computers are stupid with arithmetic and will give me 10 decimal places to the right, and then keep these answers and process all these digits to compute a new answer. Audio and video processing generate data that is interpreted by human analog processes that greatly smooth out the results. Sound is measured on a logarithmic scale; I defy any human to detect less than 1 dB of sound pressure. And computers measure millivolts, not dB. We could save a lot of data storage by compressing the raw data and reducing the number of bits to manage.
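The significant-digits point is easy to demonstrate; `round_sig` below is a small helper written for this illustration, not a standard library function.

```python
import math

def round_sig(x, sig):
    """Round x to `sig` significant figures."""
    if x == 0:
        return 0.0
    return round(x, sig - 1 - math.floor(math.log10(abs(x))))

# Two readings from an instrument good to one significant figure:
a, b = 0.1, 0.2
total = a + b
print(total)                 # 0.30000000000000004 -- digits the data never had
print(round_sig(total, 1))   # 0.3
```

The computer happily reports seventeen digits for the sum of two one-digit measurements; only the first of them means anything, and an inexact circuit that corrupted the rest would lose nothing.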
This is very interesting. I think it will have an impact in power consumption of future devices. I don't mind a little less precision for the display or sound, but I hope it will still balance my checkbook correctly.
It is great to see this idea applied to processors!
This idea has applications even for scientific computing, albeit at the shared-memory/message-passing level. The book "Parallel and Distributed Computation: Numerical Methods" by Bertsekas and Tsitsiklis of MIT identifies asynchronous/partially-asynchronous parallel iterative algorithms that converge to correct solutions even if their intermediate results are not exchanged in timely fashion. One could go from there to partially-accurate intermediate results rather than non-timely intermediate results.
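A toy version of such a partially-asynchronous iteration (a sketch of the general idea, not an algorithm from the book): Jacobi iteration on a diagonally dominant system still converges even when each sweep reads stale values that are only refreshed every third sweep, standing in for non-timely message exchange.

```python
import random

# Build a random diagonally dominant system A x = b, which guarantees
# Jacobi iteration converges.
random.seed(2)
n = 6
A = [[10.0 if i == j else random.uniform(-1, 1) for j in range(n)]
     for i in range(n)]
b = [random.uniform(-5, 5) for _ in range(n)]

x = [0.0] * n
stale = list(x)           # last values "received" from the other workers
for it in range(200):
    if it % 3 == 0:       # exchange values only every third sweep
        stale = list(x)
    # Each component updates from the stale snapshot, not the latest x.
    x = [(b[i] - sum(A[i][j] * stale[j] for j in range(n) if j != i)) / A[i][i]
         for i in range(n)]

residual = max(abs(b[i] - sum(A[i][j] * x[j] for j in range(n)))
               for i in range(n))
print(f"max residual with stale updates: {residual:.2e}")
```

The staleness slows convergence but does not break it, which is the property that makes trading timeliness (or accuracy) of intermediate results for speed or power attractive.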
Way back in the mid-1990s, we exploited this to propose non-strict cache coherence/message passing for parallel processors, to speed up parallel programs (HiPC 1996: "Program-Level Control of Network Delay for Parallel Asynchronous Iterative Applications", http://dl.acm.org/citation.cfm?id=822137). Today that would help power savings as well! Later we looked at modern applications that leverage genetic algorithms, etc (ICPP 2000, ISCA WSHMM Workshop 1999, etc). Subsequent to our 1996 HiPC paper, folks at Georgia Tech and erstwhile DEC CRL applied the idea to parallel multimedia applications -- they built a system called Beehive I think.
It is great that the idea has now reached the processor level. I recall when I first talked about my version of the idea to a few colleagues in the mid-1990s. They looked at me like I didn't know the basics of science!
You may think that a financial/banking application would always require absolute precision at 32-bit or 64-bit resolution but......
....already some bodies like the U.K. tax authority are unlikely to bother about calculations to the last penny, or to pursue people who have got tax returns wrong in the pence column.
In many individual cases that may involve ignoring a relatively low-order significant digit in base 10. Obviously, for company accounts the pence is the least significant of a much larger number of significant digits.
Nonetheless for a great number of calculations (just as on your calculator) there are a large number of leading zeros that we insist on adding together to get zero.
@Peter: You may think that a financial/banking application would always require absolute precision at 32-bit or 64-bit resolution but......
I thought that it was mandated by law that software used by financial institutions return the EXACT same answers as if you performed the calculation using pencil and paper, and that it was for this reason that they still used a form of BCD (binary-coded decimal) rather than binary-based floating point...
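The difference is easy to demonstrate with Python's `decimal` module, which provides the exact base-10 arithmetic that BCD hardware offered (an illustration of the representation issue; actual banking regulations vary).

```python
from decimal import Decimal

# Binary floating point cannot represent 0.10 exactly, so repeated
# additions drift; exact decimal arithmetic does not.
float_total = sum(0.10 for _ in range(1000))
decimal_total = sum(Decimal("0.10") for _ in range(1000))
print(float_total)    # slightly off from 100.0
print(decimal_total)  # exactly 100.00
```

A thousand ten-pence entries summed in binary floating point come out a few trillionths of a penny wrong; the decimal sum matches pencil-and-paper exactly.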
You may well be right in terms of financial computing.
But my point is that if I accidentally make a misstatement in my tax return in the pence column (which in my case would be quite a significant digit), I doubt that Her Majesty's Revenue and Customs will come after me; not while they are chasing Google and the Duchy of Cornwall.