SAN FRANCISCO—Nvidia Corp. said Monday (Feb. 28) that the latest version of its CUDA toolkit for developing parallel applications on the company's graphics processing units (GPUs) includes new features designed to make parallel programming easier and to help more developers port their applications to GPUs.
CUDA 4.0 includes Nvidia's GPUDirect 2.0 technology, which speeds communication between GPUs and simplifies multi-GPU programming, as well as unified virtual addressing, which places the host and all GPUs in a single address space to make parallel programming quicker and easier, Nvidia (Santa Clara, Calif.) said. CUDA 4.0 also bundles an open source library of C++ parallel algorithms and data structures that eases GPU programming for C++ developers, the company said.
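The C++ library described above is Thrust, which Nvidia bundled with the CUDA 4.0 toolkit. A minimal sketch of the STL-like style it enables, here sorting random integers on the GPU (sizes and variable names are illustrative):

```cuda
// Hedged sketch: sorting 1M integers on the GPU with Thrust's
// STL-like containers and algorithms (bundled with CUDA 4.0).
#include <thrust/host_vector.h>
#include <thrust/device_vector.h>
#include <thrust/sort.h>
#include <thrust/copy.h>
#include <cstdlib>

int main(void)
{
    // Generate 1M random integers on the host.
    thrust::host_vector<int> h_vec(1 << 20);
    for (size_t i = 0; i < h_vec.size(); ++i)
        h_vec[i] = rand();

    // Assigning to a device_vector copies the data to GPU memory.
    thrust::device_vector<int> d_vec = h_vec;

    // thrust::sort dispatches a parallel sort that runs on the GPU.
    thrust::sort(d_vec.begin(), d_vec.end());

    // Copy the sorted result back to the host.
    thrust::copy(d_vec.begin(), d_vec.end(), h_vec.begin());
    return 0;
}
```

The point of the library is visible in the snippet: the container assignment and algorithm calls look like standard C++, while the data movement and parallel execution happen on the GPU behind them.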
"Unified virtual addressing and faster GPU-to-GPU communication make it easier for developers to take advantage of the parallel computing capability of GPUs," said John Stone, senior research programmer, University of Illinois, Urbana-Champaign, in a statement released by Nvidia.
The CUDA 4.0 release includes a number of other key features and capabilities, including MPI integration with CUDA applications, sharing of a single GPU among multiple CPU threads, and control of multiple GPUs from a single CPU thread, Nvidia said.
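A hedged sketch of how the GPUDirect 2.0 and unified-virtual-addressing features fit together in the CUDA 4.0 runtime API: once peer access is enabled between two devices, a plain cudaMemcpy with the cudaMemcpyDefault kind can copy directly from one GPU's memory to another's, with the runtime inferring the direction from the pointers (buffer sizes here are illustrative):

```cuda
// Hedged sketch: CUDA 4.0 peer-to-peer copy between two GPUs.
// With unified virtual addressing, host and device pointers live in
// one address space, so cudaMemcpyDefault lets the runtime deduce
// that both pointers are device pointers and route the copy
// GPU-to-GPU without staging through host memory.
#include <cuda_runtime.h>
#include <cstdio>

int main(void)
{
    int n = 0;
    cudaGetDeviceCount(&n);
    if (n < 2) { printf("this sketch needs two GPUs\n"); return 0; }

    // Allow each device to access the other's memory directly.
    cudaSetDevice(0);
    cudaDeviceEnablePeerAccess(1, 0);
    cudaSetDevice(1);
    cudaDeviceEnablePeerAccess(0, 0);

    const size_t bytes = 1 << 20;  // 1 MB, illustrative
    float *d0 = NULL, *d1 = NULL;
    cudaSetDevice(0);
    cudaMalloc(&d0, bytes);
    cudaSetDevice(1);
    cudaMalloc(&d1, bytes);

    // Direct GPU-to-GPU copy; the runtime infers both endpoints.
    cudaMemcpy(d1, d0, bytes, cudaMemcpyDefault);

    cudaFree(d0);
    cudaFree(d1);
    return 0;
}
```

Before CUDA 4.0, a copy like this typically had to be staged through a host buffer with two separate transfers; the peer-access path removes that round trip.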
A release candidate of CUDA Toolkit 4.0 will be available free of charge beginning March 4, Nvidia said. The release is available to developers who enroll in the CUDA Registered Developer Program, the company said.