OpenCL implements a host-device architecture, where the host processor
(on which the application runs) offloads work to a computing resource.
When a kernel is submitted for execution by the host, an index space is
defined. The index space represents the set of data that the kernel will
be applied to. It can have 1, 2 or 3 dimensions (hence the name of
NDRange, or N-dimensional range). The instance of a kernel executing on
an individual entry in the index space is called a work-item. Work-items
can be grouped into work-groups, each of which executes on a single
compute unit.
Kernels can be compiled ahead of time and stored in
the application as binaries, or JIT-compiled on the device, in which
case the kernel code will be embedded in the application as source (or a
suitable intermediate representation). The kernel can be compiled to
execute on any of the supported devices in the platform.
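To make the index space described above more concrete, here is a minimal sketch in plain Python. It is a model of the idea, not OpenCL API code: the names `enumerate_ndrange`, `run_kernel` and `scale` are made up for illustration, and the work-items run one after another rather than in parallel on a device. The three IDs it computes play the role of what OpenCL C exposes as `get_global_id`, `get_group_id` and `get_local_id`.

```python
from itertools import product


def enumerate_ndrange(global_size, local_size):
    """Yield (global_id, group_id, local_id) for every work-item.

    Both sizes are tuples with 1, 2 or 3 entries. This sketch requires
    each global dimension to be a multiple of the local dimension.
    """
    if not 1 <= len(global_size) <= 3 or len(global_size) != len(local_size):
        raise ValueError("expected matching sizes with 1, 2 or 3 dimensions")
    if any(g % l for g, l in zip(global_size, local_size)):
        raise ValueError("global size must be a multiple of local size")
    for global_id in product(*(range(g) for g in global_size)):
        # Which work-group the item falls in, and its position inside it.
        group_id = tuple(i // l for i, l in zip(global_id, local_size))
        local_id = tuple(i % l for i, l in zip(global_id, local_size))
        yield global_id, group_id, local_id


def run_kernel(kernel, global_size, local_size, *args):
    """Apply `kernel` once per work-item (sequentially in this model)."""
    for global_id, _group_id, _local_id in enumerate_ndrange(global_size, local_size):
        kernel(global_id, *args)


def scale(global_id, src, dst, factor):
    """Toy kernel: each work-item handles exactly one element."""
    x, y = global_id
    dst[y][x] = src[y][x] * factor


src = [[1, 2, 3, 4], [5, 6, 7, 8]]
dst = [[0] * 4 for _ in range(2)]
# A 4x2 index space split into work-groups of 2x1 work-items.
run_kernel(scale, (4, 2), (2, 1), src, dst, 10)
print(dst)  # [[10, 20, 30, 40], [50, 60, 70, 80]]
```

The kernel body never loops over the data itself. It only looks at its own position in the index space, which is what lets a real device run many work-items at once.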
The application developer defines a context of execution, which is the
environment the OpenCL C kernels execute in. The context includes the
list of target devices, associated command queues, the memory accessible
by the devices and its properties. Using the API, the application can
queue commands such as execution of kernel objects, movement of data
between the host and the compute device, synchronization to enforce ordered
execution between commands, and events to be triggered or waited upon.
The architecture of the Renderscript API is analogous to OpenCL, and it
likewise enables general purpose computing to be carried out on the GPU. The ARM
Mali-T600 series of GPUs has been specifically designed for general
purpose GPU computing, and an OpenCL 1.1 Full Profile DDK as well as a
Renderscript DDK is available from ARM.
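As a rough illustration of the host-side flow described earlier, where the application sets up a context and then queues memory transfers and kernel executions ordered by events, here is a toy model in plain Python. None of the class or method names below belong to the OpenCL or Renderscript APIs. `Context`, `CommandQueue`, `Event` and the device name `"gpu0"` are all hypothetical, and the model only covers a simple in-order queue.

```python
class Event:
    """Completion marker that later commands can wait on."""

    def __init__(self, name):
        self.name = name
        self.complete = False


class CommandQueue:
    """Toy in-order queue: commands run in submission order on finish()."""

    def __init__(self, device):
        self.device = device
        self._pending = []
        self.log = []  # names of the commands that have completed

    def enqueue(self, name, action, wait_for=()):
        event = Event(name)
        self._pending.append((event, action, tuple(wait_for)))
        return event

    def finish(self):
        """Run every pending command, checking its wait list first."""
        for event, action, wait_for in self._pending:
            blocked = [e.name for e in wait_for if not e.complete]
            if blocked:
                raise RuntimeError(f"{event.name} waits on unfinished {blocked}")
            action()
            event.complete = True
            self.log.append(event.name)
        self._pending.clear()


class Context:
    """Groups the target devices, their queues and device-visible memory."""

    def __init__(self, devices):
        self.devices = list(devices)
        self.buffers = {}  # stands in for memory objects, keyed by name
        self.queues = [CommandQueue(d) for d in self.devices]


ctx = Context(["gpu0"])  # "gpu0" is a made-up device name
queue = ctx.queues[0]
host_in = [1, 2, 3, 4]
host_out = []


def write():
    ctx.buffers["data"] = list(host_in)  # host -> device copy


def kernel():
    ctx.buffers["data"] = [v * v for v in ctx.buffers["data"]]


def read():
    host_out.extend(ctx.buffers["data"])  # device -> host copy


written = queue.enqueue("write_buffer", write)
computed = queue.enqueue("run_kernel", kernel, wait_for=[written])
queue.enqueue("read_buffer", read, wait_for=[computed])
queue.finish()
print(host_out)  # [1, 4, 9, 16]
```

In this model nothing runs until `finish()` is called. That is a simplification: a real runtime is free to start executing queued commands earlier, and the events are what keep dependent commands in the right order.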
At this year’s ARM Technology Conference in Santa Clara, I will present two pieces on this topic. My first presentation will focus on OpenCL and how it is
enabled on ARM-based systems. I will discuss key aspects of the OpenCL
architecture and how the API is used, as well as highlights of the
OpenCL C programming language through some example code.
In my second presentation, I will go into more detail on GPU Computing, exploring some practical use cases and how the design considerations
of the ARM Mali-T600 series of GPUs make them the perfect fit for
compute frameworks such as OpenCL and Android Renderscript.
The aim of these sessions is to help you understand the applicability of
GPU Computing and how you can get started and explore it yourself.