I just heard from Dave Strenski from Cray Supercomputers that he (along with Prasanna Sundararajan and Ralph Wittig from Xilinx) recently released an updated version of their classic work comparing the peak performance of 64-bit floating-point calculations between FPGAs and microprocessors.
The latest version of this paper was published a week or so ago on the HPCwire website. The following is the intro from this paper, reproduced here with the kind permission of the editor of HPCwire:
Three years ago an article was published in HPCwire showing a method for comparing the peak performance of 64-bit floating-point calculations between FPGAs and a microprocessor. The article showed that the theoretical peak performance of the Virtex-4 LX200 was about 50 percent better than the then-current dual-core processor.
A follow-up article in HPCwire in 2008 refined these calculations, adding more detail to account for placement and routing issues in the FPGAs and using the latest release of the floating-point cores from Xilinx. These refined calculations compared three Virtex-5 FPGA devices against the then-current quad-core microprocessor.
That article showed that not only were the newer FPGAs faster than the quad-core processor, but that the gap in performance was getting larger.
In 2009, six-core microprocessors were released and Xilinx released several new Virtex-6 FPGA devices. Recalculating the performance of all these devices shows that this gap in performance between the FPGAs and microprocessors continues to grow….
Dr DSP, I fully agree that this study should be expanded to GPUs and DSPs. Just have not found the time yet. These theoretical designs also need to be placed and routed to make sure my assumptions match reality. You're also correct that the external memory connected to the FPGAs can play an important role in the true performance, but which FPGA board design do I use? This quickly grows into a large can of worms.
This study should compare GPUs too! The memory interface speed can make a big difference too. FPGAs really need lots of external memory to keep up with the processing. I hope that is included somehow as well.