SAN JOSE, Calif. – Advanced Micro Devices tipped a few details about Polaris, its next-generation graphics processor due later this year. Its rivalry with Nvidia's competing chip could also illuminate competition in process technologies and DRAM stacks. Polaris uses a new architecture made in a 14nm FinFET process to deliver nearly twice the performance per watt of AMD's prior-generation GPU.
The chip, also known as AMD’s Graphics Core Next (GCN) 4.0, was shown running at less than 86W compared to 152W for a current GPU. Both chips were running the same videogame at 60 frames/second and 1080-progressive resolution. AMD will demo the chip at CES this week and has already sampled it to some customers.
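As a rough check on the "nearly twice" claim, the two power figures can be turned into a performance-per-watt ratio. The sketch below assumes the 86W and 152W figures are directly comparable measurements and that both chips held exactly 60 frames/second; the function name and parameters are illustrative, not anything AMD published. Because the new chip ran at "less than" 86W, the result is a lower bound.

```python
def perf_per_watt_ratio(new_watts, old_watts, new_fps=60.0, old_fps=60.0):
    """Return (frames/second per watt of new chip) / (same for old chip)."""
    return (new_fps / new_watts) / (old_fps / old_watts)

# Figures reported for the demo: under 86W (Polaris) vs. 152W (current GPU),
# both at 60 frames/second. Treating 86W as exact is an assumption.
ratio = perf_per_watt_ratio(new_watts=86.0, old_watts=152.0)
print(f"{ratio:.2f}x")  # prints 1.77x
```

At equal frame rates the ratio reduces to 152/86, or about 1.77, which squares with "nearly twice" for this one workload.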
Presumably AMD is using a 14nm process at Globalfoundries, formerly its internal chip-making division. Samsung developed the 14nm process, which comes in two flavors: an early-to-market version and a performance-optimized version.
An AMD spokesman said the company is "working with multiple foundries on our FinFET products, including GlobalFoundries."
David Kanter, a senior analyst with the Linley Group, said he believes AMD waited to use the optimized second version of the 14nm process. "I think the goal of the first process was to serve Apple and Samsung and I expect they locked up almost all the wafers," Kanter said.
Overall, Kanter expects AMD will make evolutionary but not revolutionary changes in all the graphics-related subsystems of Polaris, in part because AMD's chips are in all the current videogame consoles as well as a hefty share of PCs. “There’s a lot of benefits in sticking with what works in software optimized for developers,” he said.
AMD made the disclosure in a YouTube video. In the video, one architect said Polaris uses a new architecture. It supports deeper instruction buffers to improve single-thread performance and features tighter clock and power gating than previous chips, he said.
Polaris also sports improvements in its instruction pre-fetch engine, memory compression, primitive discard accelerator, scheduler, and shader efficiency. It will support HDMI 2.0a, DisplayPort 1.3, and a hardware codec for H.265, including 4K encoding at 60 frames/second.
The company did not comment on the memory architecture of the chip. Last year, AMD released its first graphics chip using High-Bandwidth Memory (HBM) stacked DRAMs from Hynix to increase the capacity and memory bandwidth of a GPU.
Nvidia is expected to release its Pascal chip this year, likely using a TSMC 16nm FinFET process and the Hybrid Memory Cube stacked DRAM technology from Micron. One source said the Pascal chip delivers up to 12 TFlops of performance.
In a brief encounter in June, Nvidia’s chief executive Jen-Hsun Huang said he aims to use chip stacks with up to 32 Gbytes of memory, compared to the Fiji part AMD announced in June using 4 Gbytes of DRAM.
"At this time we’ve only publicly demonstrated a GDDR5 configuration of the Polaris architecture," an AMD spokesperson said. "It’s important to understand that HBM isn’t (currently) suitable for all GPU segments due to the current HBM cost structure. In the mainstream GPU segment, GDDR5 remains an extremely cost-effective, efficient and viable memory technology," the spokesperson added.
“For today’s graphics 4-8 Gbytes is more than sufficient. Even with 4K resolution video, by the time you run out of memory you have run into other [more basic performance] problems,” said Kanter of Linley Group.
The analyst suspects Nvidia will use HBM, in part to avoid splitting resources of graphics developers already working on HBM for AMD’s product. He believes Pascal will use a stack with 32 Gbytes of HBM and 1 TByte/second of bandwidth, in part to leapfrog its rival Intel in high-performance computing.
Intel’s Xeon Phi currently uses 16 Gbytes of DRAM and is gaining ground as a co-processor in supercomputers. Intel is expected to bolster its product using Altera FPGAs, a move it could announce as early as the International Supercomputing Conference in June. Nvidia could get ahead of Intel’s news by announcing Pascal at its own annual conference in April. One unknown is how Nvidia will support the cyclic redundancy checks and other high-end features built into Micron’s HMC.
Once the new chips hit testing and teardown labs later this year, they likely will give new insights into the relative performance and robustness of competing FinFET processes and DRAM stacks.
“AMD has had the lead with stacked memory, it’s a horse race as to which company will be able to show the first production FinFET part,” said Jon Peddie, principal of Jon Peddie Research (Tiburon, Calif.). “I can’t really make a comparison since neither company has released any specifics about their products, and it’s doubtful anything concrete will come out at CES,” he added.
Another graphics expert who asked not to be named said Polaris appears to be optimized for cost, targeting the videogame market, while Pascal is optimized for performance, targeting general-purpose GPU computing in systems such as supercomputers.
“From what I can gather about Pascal, it will be quite large and will probably have yield/cost issues...a smaller [AMD] chip would certainly be cheaper to make, especially in a new [14nm] process,” the expert said.
Kanter noted both Polaris and Pascal refer to families of processors that will include a range of low- to high-end parts optimized for various markets.
AMD suggested Polaris also sports an improved display engine, geometry processor, L2 cache and memory controller but provided no details on them. (Image: AMD)
— Rick Merritt, Silicon Valley Bureau Chief, EE Times