Intel recently gave a tour inside the workings of its latest graphics processors, and here is what analyst Jon Peddie found out about the x86 giant's latest GPUs.
Intel has been a little cagey about what's in its embedded graphics core, but at its recent developer conference the company gave a deep-dive into its GPU. There's a lot there.
The HD Graphics 5300 is the first product derived from Intel's Gen 8 processor graphics architecture. It appears in Intel's new Core M processors, targeted at tablets and other small-form-factor devices. The Core M uses a ring bus between the CPU cores, caches, and the GPU, with a dedicated local interface for each connected processor or cache. The ring interconnect is a 32-byte-wide bidirectional data bus, with separate lines for request, snoop, and acknowledge, which makes the GPU a first-class citizen.
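The raw data bandwidth implied by a 32-byte-wide bidirectional bus is easy to sketch; the ring clock frequency below is a hypothetical for illustration, not a figure from Intel:

```python
# Data bandwidth of a 32-byte-wide bidirectional ring bus.
# The ring clock is an assumed value, not stated in the article.
bus_width_bytes = 32
ring_clock_hz = 2.0e9                       # hypothetical 2 GHz ring clock

per_direction_gbs = bus_width_bytes * ring_clock_hz / 1e9
print(per_direction_gbs)                    # 64.0 GB/s in each direction
```

Because the bus is bidirectional, aggregate bandwidth at a given clock is twice the per-direction figure.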
The graphics technology interface (GTI) is the gateway between the GPU and the rest of the SoC. GTI facilitates communication with the CPU cores, and possibly with other fixed-function devices such as camera imaging pipelines. Intel doubled the write bandwidth from GTI in some versions of the GPU, and implemented coherent shared virtual memory between CPU cores and the GPU.
The architecture is based on so-called execution units (EUs), each supporting seven threads, with each thread backed by a file of 128 general-purpose registers. Within each EU, the primary computation units are a pair of SIMD floating-point units that support both floating-point and integer computation. Each SIMD FPU can complete simultaneous add and multiply floating-point instructions every cycle.
The EUs are clustered into groups called subslices which are further clustered into slices. These elements are modular building blocks Intel uses to create product variants.
An Intel Gen 8 execution unit consists of seven threads each with multiple general-purpose register files and some supporting architecture-specific registers. (Source: Intel)
Versions of Core M with the Iris Pro 5200 GPU will contain 128 Mbytes of embedded DRAM, which is not on the processor die itself but on a separate die in the same package. Gen 8 graphics use 576 Kbytes of L3 cache per slice, up from 384 Kbytes in the prior generation.
A new feature for the Gen 8 GPU is global memory coherency between the GPU and the CPU cores. SoC products with the new GPU integrate new hardware components to support Intel Virtualization Technology for Directed I/O. This specification extends Intel's existing method for mapping virtual machines to physical resources.
Intel has gradually executed a major U-turn in its attitude toward the silicon budget for graphics. Intel has always had the engineering talent to design capable and clever GPUs, but marketing, manufacturing, and finance could never see any ROI in it, believing no one would pay extra for graphics.
That has changed, with each generation since the Gen 6 GPU getting more die space. The Gen 8 GPU now occupies more than 60% of the die, not including the optional external DRAM for graphics. Today you could almost say Intel builds a GPU and sticks a couple of CPU cores on the side of it.
As for cores, each EU is capable of 16 32-bit floating-point operations per cycle. Some folks would call that 16 cores. By that count, the GPU would have 384 "cores."
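The counting arithmetic can be sketched from the figures in the text. The pair of SIMD FPUs per EU and the simultaneous add-and-multiply are stated above; the SIMD width per FPU is an inference chosen to match the stated 16 ops per cycle, and the GPU clock is a hypothetical:

```python
# Peak-throughput arithmetic implied by the article (a sketch; SIMD
# width per FPU is inferred, and the clock speed is an assumption).
fpus_per_eu = 2        # pair of SIMD FPUs per EU (stated)
simd_width = 4         # 32-bit lanes per FPU (inferred to match 16 ops/cycle)
ops_per_lane = 2       # simultaneous add + multiply each cycle (stated)

flops_per_eu_per_cycle = fpus_per_eu * simd_width * ops_per_lane
print(flops_per_eu_per_cycle)   # 16, matching the article

total_cores = 384               # the article's "core" count
eus = total_cores // flops_per_eu_per_cycle
print(eus)                      # 24 EUs implied by that count

clock_ghz = 0.8                 # hypothetical GPU clock for illustration
peak_gflops = eus * flops_per_eu_per_cycle * clock_ghz
print(peak_gflops)              # 307.2 GFLOPS peak at that clock
```

The same peak-GFLOPS figure follows whether you count 24 EUs or 384 "cores," which is part of why the core-count marketing numbers mean so little on their own.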
AMD brags about having 512 GPU cores, Nvidia says it has 192, and Qualcomm won't say, though we think it has 24 multi-FPU cores. However, when you look at the benchmarks, the results don't scale with the core counts, which leaves me pretty bored with the whole discussion.
Unfortunately, many of these SoCs aren't going into open systems like a PC, where the devices could be measured on a level playing field. In the case of mobile devices, however, it's not the graphics benchmark score that matters; it's the performance per watt.
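A toy comparison shows why the two rankings can diverge; every number here is hypothetical, chosen only to illustrate the point:

```python
# Hypothetical benchmark scores and power draws for two mobile GPUs.
# The raw-score winner is not the performance-per-watt winner.
devices = {
    "gpu_a": {"score": 1200, "watts": 4.5},
    "gpu_b": {"score": 1000, "watts": 2.5},
}
for d in devices.values():
    d["score_per_watt"] = d["score"] / d["watts"]

best_raw = max(devices, key=lambda n: devices[n]["score"])
best_eff = max(devices, key=lambda n: devices[n]["score_per_watt"])
print(best_raw, best_eff)   # gpu_a wins on raw score; gpu_b wins per watt
```

In a thermally constrained tablet, the per-watt winner is usually the one that sustains its score longest.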
— Jon Peddie is President of Jon Peddie Research in Tiburon, Calif., and host of the Virtualize conference.