I was intrigued by Marvell CEO Sehat Sutardja's call in EE Times to "change and rethink Moore's Law to include the long-ignored fourth dimension" of power consumption efficiency. "What we need now is a new social contract," Sutardja wrote.
Calling Moore's Law a social contract is one way to look at it. Others see it as a self-fulfilling prophecy, or at least it has been so for the past 45 years. Ray Kurzweil claims it is a part of the Law of Accelerating Returns, whereby computing devices have been consistently multiplying their computational power at least since 1890 and possibly for centuries before that. I take Kurzweil's optimistic view; my rationale is that better computing power created in one generation enables us to develop a better computer with the next generation, thereby creating a positive feedback loop for an exponential growth of computing and related domains.
In the famous 1975 IEDM paper that begat Moore's Law, Gordon Moore predicted the annual doubling of chip complexity as a result of three trends, only one of which was scaling. Despite that, over the past two decades Moore's Law has manifested itself mainly as a 0.7x linear shrink at every process node, which on its own yields the full factor-of-two density improvement (0.7 x 0.7 is roughly half the area per transistor).
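The arithmetic behind that factor of two can be sketched in a few lines of Python. This is only an illustration: it assumes transistor area scales with the square of the linear dimension, and the function name `density_gain` is made up for this example.

```python
# Linear shrink per process node, as quoted in the text.
LINEAR_SHRINK = 0.7


def density_gain(linear_shrink: float, nodes: int = 1) -> float:
    """Density multiplier after `nodes` shrinks.

    Assumes area per transistor scales as the square of the
    linear dimension (an idealization, not a process model).
    """
    area_scale = (linear_shrink ** 2) ** nodes
    return 1.0 / area_scale


# One node: 1 / 0.49, i.e. slightly more than a doubling.
print(round(density_gain(LINEAR_SHRINK), 2))
# Two nodes compound to a bit more than 4x.
print(round(density_gain(LINEAR_SHRINK, nodes=2), 2))
```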
The early days of scaling were the most rewarding. As Moore stated: "By making things smaller, everything gets better simultaneously. There is little need for trade-offs. The speed of our products goes up, the power consumption goes down, system reliability, as we put more of the system on a chip, improves by leaps and bounds, but especially the cost of doing things electronically drops as a result of the technology" (Moore, SPIE 1995). Those were the good old days. There was no need to call for a new social contract; power efficiency came naturally with scaling, as smaller transistors had smaller gate capacitance and burned less dynamic power. Further efficiencies were achieved by lowering the operating voltage all the way from 5 volts to less than a volt.
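To see why the voltage reduction mattered so much, recall the standard first-order model for switching power, P = a * C * V^2 * f. The sketch below uses that textbook formula. The function names are illustrative, and the 1 V end point is a conservative stand-in for the "less than a volt" mentioned above.

```python
def dynamic_power(c_farads: float, vdd_volts: float, freq_hz: float,
                  activity: float = 1.0) -> float:
    """First-order switching power: P = a * C * V^2 * f."""
    return activity * c_farads * vdd_volts ** 2 * freq_hz


def voltage_scaling_gain(v_old: float, v_new: float) -> float:
    """Factor by which dynamic power drops when only Vdd changes."""
    return (v_old / v_new) ** 2


# Going from 5 V to 1 V (assumed end point) with C and f held fixed
# cuts dynamic power by the square of the voltage ratio.
print(voltage_scaling_gain(5.0, 1.0))
```

Because power depends on the square of the supply voltage, the drop from 5 V to about 1 V alone accounts for roughly a 25x reduction, before any capacitance savings are counted.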
But going forward with just scaling does not look as bright. Further reduction of operating voltage will cause severe reductions in performance, and further reductions in gate capacitance will have only a negligible impact on dynamic power (interconnect capacitance these days far exceeds the gate capacitance). While lithography scaling provides all the benefits with respect to the transistors, it provides none with respect to interconnect; in fact, interconnect just gets worse. The industry has moved from two metal layers all the way to 10 metal layers, then from aluminum to copper, and recently from the convenience of SiO2 to the challenging low-k dielectric, with some even predicting air in the future. Yet the tyranny of interconnects requires us to consider other alternatives.
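The point about gate capacitance can be made concrete with a small sketch. The 4:1 ratio of interconnect to gate capacitance used below is purely an assumed figure for illustration, not a measured one, and `switched_cap_saving` is a made-up helper name.

```python
def switched_cap_saving(c_gate: float, c_wire: float,
                        gate_shrink: float) -> float:
    """Fractional drop in total switched capacitance when only the
    gate capacitance is scaled by `gate_shrink` and the interconnect
    capacitance stays the same."""
    before = c_gate + c_wire
    after = c_gate * gate_shrink + c_wire
    return 1.0 - after / before


# Assumed, illustrative ratio: interconnect is 4x the gate capacitance.
saving = switched_cap_saving(c_gate=1.0, c_wire=4.0, gate_shrink=0.7)
print(f"{saving:.1%} less switched capacitance")
```

Under that assumption, a 30% cut in gate capacitance trims total switched capacitance, and hence dynamic power at fixed voltage and frequency, by only about 6%.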
It is an exciting time to be in the semiconductor industry because you get to see innovations and feats of engineering at an almost annual rate. Every year, people find ways to make circuits smaller and more efficient. Maybe being power efficient is the way to continue Moore's Law.
Simon - http://www.starrausten.com
Now that we have changed our name to MonolithIC 3D Inc., we have updated our web site and added a lot of interesting information, including a very active blog: http://www.monolithic3d.com/blog.html. I highly recommend going through it, as many of the issues discussed above are covered in detail there.
Props to @3D Guy, @Robotics Developer, and @kdboyce for getting at the heart of it.
"It's the power dissipation, stupid!"
Stacked die, monolithic 3D, and TSV all have their own unique benefits & tradeoffs, but they all have one challenge in common: power dissipation, especially from the "meat" layers between the "bread". Flip chip packages were a revolution over wirebond because they allowed better thermal dissipation away from the board, instead of injecting the heat into the board and causing warpage, thermal expansion, depopulation, etc. You can also attach a big fat heat sink to the contact surface to draw heat away. What do you do when you can no longer draw heat away equally from all layers?
There needs to be a new set of design rules for power dissipation on the internal layers, and as @RD and @3D hinted at, there need to be some floorplanning guidelines to follow as well. EDA companies will have so much fun selling tools that can model all of that and run STA on it while dealing with multi-voltage and multi-temperature layers. If you think PVT and OCV analysis are nasty now, just wait. :)
For any 3D implementation to work, we'll probably need to bring a mechanical engineer or materials engineer onto the IC design teams in the future, to handle the new form factor, thermal expansion/contraction issues, heat distribution, and power distribution across multiple layers.
All of this means good news for us in the EE community, as there's still lots of interesting work to be done for many years to come unlocking the potential of these breakthroughs.
:sarcasm Now if you'll excuse me, I've got some patents to file on this so I can sell them to a troll so nobody can benefit from this except lawyers. end:sarcasm
- In high performance chips, e.g. those used in servers, 3D will first be used to stack memory atop processors, which doesn't increase power density. Over time, as 3D becomes more mainstream, people will start stacking logic above logic to save power, reduce cost, and/or improve performance.
== I expect heat removal in 3D stacked high performance chips will be dealt with in several ways:
(1) Floorplan blocks in 3D such that a high power density block is stacked above a low power density one.
(2) Use dense power grids, which transfer heat from the various dice to the heat sink.
(3) Servers are moving away from individual 130W dice toward lower power cores (45W-50W), and many more of them. This trend makes cooling easier.
== Power delivery is less of a challenge. Companies are using many smart techniques:
(1) In multicore chips, each core has its clock referenced to its own individual power grid. E.g., when the Vdd grid goes from 1V to 0.9V due to noise, the clock frequency is slightly lowered, so one doesn't need to leave so much margin for noise.
(2) Freescale uses on-chip stacked capacitors that provide huge amounts of decoupling. IBM uses trench caps for eDRAM, and these provide a large amount of capacitance, which is also used for decoupling supply noise.
(3) In multicore chips, when a core is shut down, its frequency is lowered slowly from 1GHz to 750MHz to 500MHz to 0MHz. This reduces the amount of supply noise.
(4) There are many other solutions to the supply noise problem; the list is too long to discuss.
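The first heat-removal idea, stacking a high power density block over a low power density one, can be sketched as a simple greedy pairing for a two-tier stack. This is only a toy illustration, not how any real floorplanner works: the block names and W/cm2 figures are invented, and `pair_blocks` is a hypothetical helper.

```python
def pair_blocks(power_density: dict) -> list:
    """Pair blocks for a two-tier stack, hottest with coolest.

    Returns (hot_block, cool_block, combined_density) tuples.
    With an odd number of blocks, the middle one is left unstacked.
    """
    ordered = sorted(power_density.items(), key=lambda kv: kv[1])
    pairs = []
    lo, hi = 0, len(ordered) - 1
    while lo < hi:
        cool, hot = ordered[lo], ordered[hi]
        pairs.append((hot[0], cool[0], hot[1] + cool[1]))
        lo += 1
        hi -= 1
    if lo == hi:  # odd block count: one block has no partner
        pairs.append((ordered[lo][0], None, ordered[lo][1]))
    return pairs


# Invented example values in W/cm^2, for illustration only.
blocks = {"cpu": 100, "gpu": 80, "sram": 10, "io": 20}
for hot, cool, total in pair_blocks(blocks):
    print(hot, "over", cool, "->", total, "W/cm^2")
```

With these made-up numbers the worst stacked column carries 110 W/cm2, compared with 180 W/cm2 if the two hottest blocks were placed on top of each other.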
- In monolithic 3D, the connections between dice are very short, so their performance/power penalties are negligible.
These are good questions. Whenever you pack more components in the same area (e.g. with 3D or by scaling), you always have to deal with higher power densities and heat removal issues.
- There are many applications, e.g. chips used in mobile phones and tablet computers, where power consumption is less than 1W. In these (huge) markets, power delivery and heat removal are less of an issue, so the first penetration of 3D will be there. It helps that these markets are projected to be the biggest markets for semiconductors in the next 10 years.
I think that 3D is an interesting idea! I am wondering how we get enough power into the multiple devices and then get the heat out. Is there any cost (speed/power) associated with the connections between the "dies"? I can't help but wonder if the operating speeds and the geometries will conspire to cause system level timing issues (ringing, reflections, slow rise-times, etc.). I can remember doing 3D SPICE simulations on packaging interconnects for earlier high-speed interfaces, and the difficulty of getting on/off chip with good enough signal integrity. Are there issues (or have they been solved) with the 3D stack interconnects?
Monolithic 3D chips are the way of the future. But as with all developments, it will finally be the applications that will drive the form and features thereof and dictate how the chips must be made.
The key problems now are getting rid of heat and efficiently handling the necessary interconnects, internally and externally, without killing the area/volume advantages that 3D stacking of similar or dissimilar wafers can provide.