To say that nanometer chip design is "hot" is more than just a metaphor. At 90 nanometers and below, thermal variations are beginning to have an impact on chip power, performance and reliability, opening up a new concern for digital IC designers. For analog and mixed-signal IC designers, thermal concerns are nothing new, but analysis and optimization capabilities have been lacking.
IC thermal-analysis tools that can track thermal gradients across a die are just now starting to emerge, helping designers understand the impact on leakage, performance and IR drop. Thermal-analysis tools for IC packages have been available for some time, and can potentially complement the newer tools that focus on IC substrates.
Designers at Freescale Semiconductor Inc. are seeing thermal gradients at 90 nm, and those gradients are expected to become more significant at 65 nm, said Kamal Khouri, system design methodology manager at Freescale. "Predictions show that thermal effects will impact leakage, performance and electromigration," he said. "My group has focused on looking at the impact thermal effects have on leakage." What they're seeing, Khouri said, is a "superlinear" relationship between subthreshold leakage and temperature variations.
IC designers today typically run multicorner simulations that include worst-case temperature values. But each temperature corner represents a constant chip temperature, and does not account for gradients across the die. With only a worst-case value, designers may assume that leakage is worse than it really is, said Arvind Narayanan, product manager at Magma Design Automation Inc. With instance-specific thermal values, he said, it may be possible to reclaim power and performance by going to a coarser power mesh, using fewer straps and freeing routing resources.
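A rough illustration of why a single worst-case corner can overstate leakage: the sketch below compares a leakage estimate that puts every block at the hottest temperature with one that uses per-block temperatures. The exponential model (leakage doubling every 10°C), the reference temperature and the block temperatures are all assumptions chosen for illustration; they are not figures from Freescale or Magma, and real leakage behavior depends on the process.

```python
def leakage_scale(temp_c, ref_c=25.0, doubling_c=10.0):
    """Relative subthreshold leakage versus a reference temperature.

    ASSUMPTION: leakage doubles every `doubling_c` degrees C. This is a
    toy superlinear model, not a characterized device model.
    """
    return 2.0 ** ((temp_c - ref_c) / doubling_c)


def total_leakage(temps_c, leak_at_ref=1.0):
    """Sum relative leakage over blocks, one temperature per block."""
    return sum(leak_at_ref * leakage_scale(t) for t in temps_c)


# Hypothetical per-block temperatures across a die (degrees C).
block_temps = [75.0, 85.0, 95.0, 105.0, 125.0]

# Worst-case corner: every block is assumed to sit at the hottest value.
corner_estimate = total_leakage([max(block_temps)] * len(block_temps))

# Gradient-aware estimate: each block uses its own temperature.
gradient_estimate = total_leakage(block_temps)

overestimate = corner_estimate / gradient_estimate
print(f"corner: {corner_estimate:.0f}, gradient-aware: {gradient_estimate:.0f}, "
      f"ratio: {overestimate:.2f}x")
```

Under these assumed numbers the worst-case corner overstates total leakage by more than a factor of three, which is the kind of margin that instance-specific thermal values might let a designer recover.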
Opinions differ on when digital IC designers really need to start looking at thermal gradients. Anand Iyer, product-marketing director at Cadence Design Systems Inc., said gradients will probably not be a significant issue until designers are putting out large chips at 65 or 45 nm. But Gary Smith, chief EDA analyst at Gartner Dataquest, noted that he saw his first thermal failure at 130 nm. "There was a 64-bit bus that was heating up during switching," he said. "The critical path passed through the area and the temperature rise caused a timing failure." The IC layout had to be redone, causing a silicon respin, Smith said.
Packing it in
Increasing power densities are the main reason that thermal variations become more of a problem as feature sizes shrink. In a paper given at the 2005 International Conference on Computer-Aided Design (ICCAD), authors from the University of Texas and Freescale noted that the power density of high-performance microprocessors has already reached 50 W/cm² at 100 nm, and will reach 100 W/cm² at 50 nm. The paper further noted that low-power design techniques such as clock gating, voltage islands, dual voltage threshold libraries and power gating "can cause significant on-chip temperature gradients and local hot spots."
Rajit Chandra, technical founder of startup Gradient Design Automation, noted that digital ICs can experience as much as a 50°C temperature variation across a die, and even higher in the metal layers. For analog circuits, it gets worse; here, said Chandra, the gradients could be 100°C. And yet, he noted, even a 4°C gradient across a bandgap reference circuit can cause a malfunction.