Inside Intel’s Haswell with tour guide David Kanter
Rick Merritt
11/14/2012 2:01 AM EST
SAN JOSE, Calif.--Want to know what’s inside Intel’s next-generation microprocessor? Ask David Kanter. The CPU blogger just published a deep dive on Haswell, Intel’s first microarchitecture designed specifically for its 22-nm FinFET process.
Haswell will emerge next year, probably ahead of most of the 64-bit ARM chips and in tandem with AMD’s next generation cores such as Steamroller. Ultimately, it will appear in everything from tablet SoCs to server CPUs.
Kanter describes Haswell as “a dual-threaded, out-of-order microprocessor that is capable of decoding five instructions, issuing four fused micro-ops and dispatching eight micro-ops each cycle.”
At this stage, Kanter could only do a paper analysis of the microarchitecture. It will be many months before we see test results on working Haswell chips.
That said, Kanter was able to make some interesting high-level projections based on his deep dive into the workings of the chip. “We estimate that a Haswell core will offer around 10 percent greater performance for existing software, compared to [Intel’s current] Sandy Bridge [processors, and] for workloads using the new extensions, the gains could be significantly higher,” he said.
In theory, some instruction set extensions could double performance on some jobs, and a new transactional memory feature could provide 30 percent gains on other operations, he said.
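As a rough illustration of how such figures might translate into whole-program gains, the sketch below applies Amdahl's law. The fraction of run time that benefits (`accelerated_fraction`) is a hypothetical input, not a number from Kanter's analysis; only the 2x and 1.3x (30 percent) local factors are taken from the projections above.

```python
def overall_speedup(accelerated_fraction: float, local_speedup: float) -> float:
    """Amdahl's law: whole-program speedup when only part of the run time
    benefits from a faster feature.

    accelerated_fraction: share of the original run time that benefits
        (hypothetical; the article gives no such figure).
    local_speedup: speedup of that share alone, e.g. 2.0 for "double".
    """
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if local_speedup <= 0.0:
        raise ValueError("local_speedup must be positive")
    unaffected = 1.0 - accelerated_fraction
    return 1.0 / (unaffected + accelerated_fraction / local_speedup)


# Illustrative only: sweep a few assumed fractions for the two projected factors.
for label, factor in (("new extensions (2x)", 2.0), ("transactional memory (1.3x)", 1.3)):
    for fraction in (0.25, 0.5, 1.0):
        print(f"{label}, {fraction:.0%} of run time: "
              f"{overall_speedup(fraction, factor):.2f}x overall")
```

The point of the sketch is that the headline factor is only reached when the whole workload benefits; any code that does not use the new feature pulls the overall gain back toward 1x.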
Measured next to its traditional x86 competitor, “Intel is already far ahead of AMD in terms of CPU performance,” writes Kanter. “The performance gap should narrow given the scope of opportunities for AMD to improve, but Haswell will continue to have significant advantages.”
Haswell will come in 10W versions for tablets, where it will compete with 4W ARM-based SoCs, Kanter added. We will need to wait for working silicon to know the relative performance-per-watt efficiency of Haswell against the ARM chips, he said.
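A back-of-the-envelope sketch shows what the two power figures would imply for performance per watt. It assumes, purely for illustration, that each chip draws its quoted power for the whole workload; the 10W and 4W values are the nominal ratings cited above, not measured draw, and the function names are hypothetical.

```python
def perf_per_watt(performance: float, power_watts: float) -> float:
    """Performance (any consistent unit) delivered per watt of power."""
    if power_watts <= 0.0:
        raise ValueError("power_watts must be positive")
    return performance / power_watts


def break_even_performance_ratio(power_a_watts: float, power_b_watts: float) -> float:
    """How many times faster chip A must be than chip B to match B's
    performance per watt, assuming both run at their quoted power."""
    if power_a_watts <= 0.0 or power_b_watts <= 0.0:
        raise ValueError("power values must be positive")
    return power_a_watts / power_b_watts


# Nominal ratings from the article; real draw under load may differ.
ratio = break_even_performance_ratio(10.0, 4.0)
print(f"A 10W part needs {ratio:.1f}x the performance of a 4W part "
      "to match it on performance per watt")
```

Under these assumptions the 10W part would have to be 2.5 times as fast just to draw level on efficiency, which is why measurements on working silicon matter.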
Indeed, there will be plenty of tales to tell once real chips get out of the lab next year. Until then, microprocessor aficionados can enjoy Kanter’s block-by-block tour of the architecture.
resistion
11/14/2012 8:28 AM EST
Some hints on the web that Haswell will have off-chip DRAM as L4: http://news.softpedia.com/news/Intel-Haswell-Graphics-Better-Due-to-4th-Level-On-Package-Cache-259366.shtml
rick.merritt
11/14/2012 10:00 AM EST
Interesting and something I have never heard of in a CPU before--an L4 cache, let alone an off-chip one.
resistion
11/14/2012 12:13 PM EST
It's still a little rumor-like to me. I also saw Anand report it as embedded DRAM as if on-chip, but if it is off-chip, would they use TSV for the speed? Isn't the normal course to make it on-chip SRAM?
SylvieBarak
11/14/2012 2:19 PM EST
Can't wait for Haswell tablets! Now THOSE will be tablets worth buying in my opinion...
Doug S
11/14/2012 4:19 PM EST
You sure about that, Sylvie? 10 watts seems like way too much for a tablet; it'll either burn you or require a fan. There won't be a lot of buyers for a tablet with a fan, even if it is more powerful.
markhahn
11/14/2012 4:49 PM EST
Stacking memory on the CPU makes a lot of sense for any lower-power chip - after all, it's pretty routine in phones. Stacking not only gives a performance boost, but saves some power. Probably hard to do with a bigger/hotter chip, though.
10W is certainly workable for a tablet, as long as it can race to sleep, has low leakage, etc.
I just wish AMD would grow some balls and produce, for instance, an APU with stacked DRAM so you could tile a bunch of them onto a board. Whatever happened to the idea of scalable multiprocessor systems anyway? (With built-in scalable GPU for free!)
chipmonk
11/14/2012 5:39 PM EST
Haswell is going to be a 2.5-D module with the Level 4 cache chip next to the processor, the chips connected by fine-pitch, high-density thin-film interconnects on the Si substrate of the module. It will have lots of interconnects, enabling lots of parallelism in memory accesses by the multiple cores in the CPU/SoC. BTW, it won't be able to stack chips (true 3D) because of the need to take heat out of the 10-watt CPU.
rick.merritt
11/15/2012 11:23 AM EST
That's big news @chipmonk. What's your source on it?
chipmonk
11/15/2012 12:04 PM EST
Informed guess - must leave it at that!
rick.merritt
11/16/2012 11:55 AM EST
Sounds similar to what Huawei/Altera are doing and what I expect other comms and server OEMs will try out over the next year or so.
http://www.eetimes.com/electronics-news/4401446/Huawei--Altera-mix-FPGA--memory-in-2-5-D-device
danny1024
11/19/2012 10:11 PM EST
Perhaps a magic information fairy pointed him to an Intel Technology Journal Article
"TERA-SCALE MEMORY CHALLENGES AND SOLUTIONS"
http://download.intel.com/technology/itj/2009/v13i4/pdfs/ITJ9.4.7_MemoryChallenges.pdf
And this corresponding patent:
"Systems, methods, and apparatuses for hybrid memory"
http://www.google.com/patents/US20110161748