SEATTLE--SUPERCOMPUTING 2011--Perhaps the most poorly kept secret at SC11 was IBM’s official unveiling of its next-generation Blue Gene/Q (BGQ) supercomputer, the third generation of its Blue Gene family, which features 16-core processors and a scalable peak performance of up to 100 petaflops.
Unveiling the BGQ on the SC11 show floor, Jim Herring, IBM’s director of STG HPC offerings, said the system was the most energy-efficient and reliable in the Blue Gene lineup so far, having already taken the number one spot on the Green500 list of energy-efficient supercomputers by delivering 2 gigaflops per watt.
Designed to have a small footprint and low power requirements, the BGQ boasts low-latency, high-performance runs that simplify error tracing and performance tuning, and is based on an open source, standards-based operating environment.
The BGQ has also been engineered to contain fewer moving parts, to make it more reliable.
Based on IBM’s PowerPC A2 processing architecture, each of the BGQ’s 64-bit processors sports 16 compute cores, four times the number used in the previous Blue Gene/P system, with each core able to handle four threads simultaneously.
IBM said each chip also has an additional core to run operating system administrative functions, plus a redundant spare core. Each core has 32 KB of L1 cache, divided equally between data and instructions, while the shared L2 cache consists of 32 MB of embedded DRAM.
A full BGQ rack would contain 1024 nodes, or 16K cores.
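The rack figures follow from simple multiplication, sketched below in Python. The per-rack thread count is not a figure IBM quoted; it is derived here on the assumption that the four-way threading applies to each of the 16 compute cores, and it ignores the extra operating system and spare cores.

```python
# Figures quoted in the article.
NODES_PER_RACK = 1024
COMPUTE_CORES_PER_NODE = 16
# Assumption: four hardware threads on every compute core.
THREADS_PER_CORE = 4


def rack_totals(nodes=NODES_PER_RACK,
                cores_per_node=COMPUTE_CORES_PER_NODE,
                threads_per_core=THREADS_PER_CORE):
    """Return (compute cores, hardware threads) for one rack."""
    compute_cores = nodes * cores_per_node
    hardware_threads = compute_cores * threads_per_core
    return compute_cores, hardware_threads


cores, threads = rack_totals()
print(f"{cores} compute cores ({cores // 1024}K), {threads} hardware threads")
```

Under those assumptions, 1024 nodes of 16 cores give 16,384 compute cores, which is the "16K" figure, and 65,536 hardware threads per rack.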
The memory and I/O controllers are integrated onto the chip itself and the I/O has been separated from the server nodes so that configurations can scale compute and I/O independently.
A BGQ rack can apparently accommodate between eight and 128 I/O nodes, which use the same PowerPC A2 chip as the compute nodes.
The firm is also touting the BGQ’s hardware-based speculative execution capabilities, which it says make multi-threading long code sections simpler, even if that code has potential data dependencies. “If conflicts are detected, the hardware can backtrack and redo the work without affecting application performance,” said the firm.
To spare programmers the potentially complex integration of locks, and the bottlenecks caused by deadlocking, IBM has also added hardware-based transactional memory for multi-threading.
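The detect-a-conflict-and-redo behaviour described for speculative execution and transactional memory can be pictured with a small software analogy in Python. This is only a sketch of the general optimistic technique, not IBM's hardware mechanism or any BGQ programming interface: `VersionedCell` and `atomically` are hypothetical names, and a short internal lock stands in for the atomic commit that the hardware would provide.

```python
import threading


class VersionedCell:
    """Shared value plus a version counter used to detect conflicts (hypothetical)."""

    def __init__(self, value):
        self.value = value
        self.version = 0
        # Stands in for the atomic commit step; held only briefly.
        self._commit_lock = threading.Lock()

    def read(self):
        with self._commit_lock:
            return self.value, self.version

    def try_commit(self, expected_version, new_value):
        """Commit only if nobody else committed since our read."""
        with self._commit_lock:
            if self.version != expected_version:
                return False  # conflict detected: caller must redo the work
            self.value = new_value
            self.version += 1
            return True


def atomically(cell, update):
    """Run update() speculatively and retry until it commits cleanly."""
    attempts = 0
    while True:
        attempts += 1
        snapshot, version = cell.read()
        result = update(snapshot)  # speculative work, no lock held
        if cell.try_commit(version, result):
            return attempts


def worker(cell, increments):
    for _ in range(increments):
        atomically(cell, lambda v: v + 1)


counter = VersionedCell(0)
threads = [threading.Thread(target=worker, args=(counter, 1000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter.value)  # 4000: every increment commits exactly once
```

The application code in `worker` never takes a lock itself, so it cannot deadlock; a conflicting update is simply discarded and recomputed. In software that retry costs time, which is the overhead IBM says its hardware implementation avoids.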
A new feature in this iteration of Blue Gene is a 5D torus interconnect, which uses fiber optics for server-to-server communication at up to 40 gigabits per second, four times faster than Blue Gene/P’s interconnect.
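In a torus, each node links to the next and previous node along every dimension, with the ends wrapping around, so a 5D torus gives each node up to ten direct neighbours. The Python sketch below illustrates the idea. The 4x4x4x4x4 shape is a hypothetical choice that happens to multiply out to the 1024 nodes of one rack; it is not a description of how IBM actually wires a BGQ rack, and `torus_neighbors` is an illustrative helper.

```python
from math import prod


def torus_neighbors(coord, shape):
    """Return the direct neighbours of coord in a torus of the given shape."""
    neighbors = set()
    for axis, size in enumerate(shape):
        for step in (-1, 1):
            candidate = list(coord)
            # Modulo arithmetic provides the wraparound link at each edge.
            candidate[axis] = (coord[axis] + step) % size
            neighbors.add(tuple(candidate))
    # A dimension of size 1 would wrap a node onto itself; drop that case.
    neighbors.discard(tuple(coord))
    return sorted(neighbors)


# Hypothetical shape for illustration only: 4**5 = 1024 nodes.
SHAPE = (4, 4, 4, 4, 4)
origin = (0, 0, 0, 0, 0)

print(prod(SHAPE))                          # 1024 nodes in this example shape
print(len(torus_neighbors(origin, SHAPE)))  # 10 direct neighbours
print(torus_neighbors(origin, SHAPE)[-1])   # (3, 0, 0, 0, 0), reached by wraparound
```

The wraparound keeps the worst-case hop count low: in this example shape, no node is more than two hops away along any single dimension.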