Design Article

3D graphics hardware IP uses OCP bus interface

Eisaku Ohbuchi
Hardware Engineer,
DMP Inc.

7/13/2007 3:00 AM EDT

Graphics in embedded systems, for user interfaces and games alike, continue to evolve, most visibly in the move from 2D to interactive 3D. The PlayStation Portable (PSP), for example, brings PlayStation 2 class 3D graphics quality to a handheld device. In PC graphics, meanwhile, the programmable shader scheme, in which developers configure functionality at the vertex and fragment level, has become the major approach on the DirectX and OpenGL API infrastructure, and this kind of hardware delivers extremely rich content and experiences on consoles such as the Xbox 360 and PlayStation 3. Graphics in embedded systems, however, raise many challenges that developers must address: low power consumption for long battery life, a minimal number of system components because of space constraints, and a limited gate count for low cost.

The Khronos Group defines media APIs for the embedded space and has released a graphics API, OpenGL ES. At present OpenGL ES comes in two versions, 1.x for a fixed-function graphics pipeline and 2.x for a programmable graphics pipeline, which mirrors the approach taken in PC graphics.

DMP graphics cores overview
DMP provides scalable, high-performance, low-power 3D graphics cores for the embedded space: handheld devices, mobile phones, vehicle navigation systems, amusement game consoles, and other embedded graphics applications. PICA200 is the latest 3D graphics IP core and covers all of these application areas.

The core consists of several components, including OpenGL ES standard functions and our own original graphics technology. The components are assembled according to each customer's requirements and target system, with attention to criteria such as performance, memory bandwidth and power consumption. With such a wide range of applications, it is difficult to predict what customers will demand when choosing an IP interface scheme. DMP has been extremely successful in adopting the Open Core Protocol (OCP) as the standard bus interface for our components.


Figure 1: PICA200 demonstration scene developed by Futuremark and DMP collaboration

Maestro technology
The graphics core achieves high performance with low power consumption by combining OpenGL ES 1.1 with our own extension graphics API, called Maestro. The Maestro functions cover frequently used, visually appealing graphics features of the target applications: various lighting and shading models such as Phong, Cook-Torrance, and BRDF; shadow effects; polygon subdivision; and procedural texture.

Maestro functions are implemented as hard-wired logic, using our own modified algorithms, to break through the central conflict in embedded system design: low power consumption versus high performance. Maestro functions can produce extremely rich content, comparable to PC and console graphics, on handheld devices (Fig. 1).

The Maestro functions include the following effects.

  1. Lighting Maestro -- includes a high-performance per-fragment lighting function and supports various shading models such as Phong, isotropic/anisotropic BRDF, and subsurface scattering.

     Figure 2: Rendering result with OpenGL ES only (left) and with OpenGL ES + our Maestro API (right)

  2. Shadow Maestro -- supports hard and soft shadowing in real time.

     Figure 3: Real-time soft-shadow rendering with our Shadow Maestro API

  3. Figure Maestro -- reduces memory bandwidth by generating fine polygons in hardware, for example from NURBS and polygon subdivision, so that the input data size can be minimized.

     Figure 4: Polygon subdivision example using Figure Maestro. The left image shows the input control polygons from the host CPU, and the right image shows the polygons generated on the fly by our graphics hardware.

  4. Mapping Maestro -- supports bump mapping and procedural texture. The procedural texture needs no memory access to produce the texture image, because the image is generated from a mathematical equation.

     Figure 5: Mapping Maestro example (Left: bump mapping with per-fragment lighting. Right: wooden pattern from our procedural texture hardware, without any texture memory access.)

  5. Particle Maestro -- generates fog, cloud and gas effects in hardware.

     Figure 6: Particle Maestro can produce fog, cloud and gas effects; our rendering algorithm composites rigid and fuzzy objects without artifacts.
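The idea behind the procedural texture in Mapping Maestro, that a texel can be computed from an equation instead of fetched from memory, can be illustrated with a minimal software sketch. This is not DMP's hardware algorithm, which the article does not describe; the function `wood_texel`, its parameters `rings` and `wobble`, and the ring formula itself are assumptions chosen only to produce a wood-like pattern.

```python
import math

def wood_texel(u, v, rings=8.0, wobble=0.05):
    """Return a brightness in [0, 1] for texture coordinate (u, v) in [0, 1] x [0, 1].

    Illustrative formula only (an assumption, not the PICA200 algorithm).
    No texture image is read: the value depends on (u, v) alone.
    """
    # Distance from the texture centre, perturbed by angle so the rings
    # are not perfect circles.
    dx, dy = u - 0.5, v - 0.5
    radius = math.hypot(dx, dy)
    radius += wobble * math.sin(6.0 * math.atan2(dy, dx))
    # The fractional part of the scaled radius repeats once per growth ring.
    phase = (radius * rings) % 1.0
    # Turn the repeating phase into smooth light and dark bands.
    return 0.5 + 0.5 * math.cos(2.0 * math.pi * phase)

def render(width, height):
    """Evaluate the pattern at every pixel centre of a width x height image."""
    return [
        [wood_texel((x + 0.5) / width, (y + 0.5) / height) for x in range(width)]
        for y in range(height)
    ]
```

Calling `render(64, 64)` yields a 64x64 grid of brightness values without touching any stored texture, which is the property the text attributes to the procedural texture hardware.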

These Maestro technologies are the result of algorithm optimization and long-term study at several research institutes. With them we are bringing PC-level graphics and experiences to embedded systems.

Hardware block diagram
The following is a block diagram of the graphics core.

All red arrows in Fig. 7 indicate the memory bus interface which adopts OCP.


Figure 7: PICA200 block diagram.

During development of the PICA200 core the following challenges were addressed:

1) The IP core must support a wide variety of applications, running the gamut from mobile phones with tiny displays to amusement consoles with large displays, in order to cover all embedded systems,

2) 3D graphics hardware needs huge bandwidth for command, texture, color and z-buffer read/write access, and this bandwidth determines the performance of the 3D IP core, and

3) The IP core should be easy to integrate in an SoC system environment.

To solve all of these issues we decided to adopt OCP as the standard interface infrastructure of our building block scheme, and as a result, we can provide the following options to meet the customer requirements (Table 1).


Table 1: A part of PICA200 building block options

Take a mobile phone system as an example. OpenGL ES functions and VGA display size support are required, and power consumption should be minimized, so the number of vertex processors and texture pipelines might be set to two and two, respectively (these numbers are just one example case). The Maestro functions are left out so that the core supports only OpenGL ES, since a large number of cell phones have no need for non-standard functionality. The texture cache parameters can also be optimized for the characteristics of the SoC bus, because those parameters are provided automatically by the OCP standard. In other applications, such as an amusement game console, all Maestro functions would be required in order to support higher-quality, more attractive content with high performance on a large display.
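The two cases above can be written down as build configurations. The sketch below is purely illustrative: `CoreConfig` and its field names are hypothetical, not part of any DMP tool, and Table 1's actual option list is not reproduced here. Only the phone values (two vertex processors, two texture pipelines, VGA, no Maestro) come from the text; the console's unit counts and display label are placeholders.

```python
from dataclasses import dataclass

# The five Maestro blocks named in the article.
MAESTRO_BLOCKS = frozenset({"lighting", "shadow", "figure", "mapping", "particle"})

@dataclass(frozen=True)
class CoreConfig:
    """Hypothetical description of one PICA200 build (illustrative names only)."""
    vertex_processors: int
    texture_pipelines: int
    display: str
    maestro: frozenset = frozenset()

    def __post_init__(self):
        # Reject configurations that make no sense as a building-block choice.
        if self.vertex_processors < 1 or self.texture_pipelines < 1:
            raise ValueError("need at least one vertex processor and one texture pipeline")
        unknown = self.maestro - MAESTRO_BLOCKS
        if unknown:
            raise ValueError(f"unknown Maestro blocks: {sorted(unknown)}")

    @property
    def opengl_es_only(self):
        """True when no Maestro block is included, as in the phone example."""
        return not self.maestro

# Mobile phone example from the text.
PHONE = CoreConfig(vertex_processors=2, texture_pipelines=2, display="VGA")

# Amusement console: every Maestro block; the counts and display label are placeholders.
CONSOLE = CoreConfig(
    vertex_processors=4,
    texture_pipelines=4,
    display="large",
    maestro=MAESTRO_BLOCKS,
)
```

Expressing each target as data makes the trade-off explicit: the phone build drops every Maestro block to save gates and power, while the console build includes all of them.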

We also implemented a pre-fetching mechanism that uses the thread protocol of OCP. This is very important for avoiding stalls in the rendering pipeline and so keeping rendering performance high. Table 2 shows an example of thread ID assignment in this graphics core; in this case, the core has four texture modules.


Table 2: Thread ID for each kind of transaction
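Why separate threads help pre-fetching can be seen in a toy model. In OCP, responses within one thread return in order, while different threads are not ordered with respect to each other, so a slow access on one thread need not hold up another. The sketch below is not a model of the PICA200 interface: the class `ThreadedPort`, the thread ID values, the addresses and the latencies are all hypothetical and are not taken from Table 2.

```python
from collections import deque

# Hypothetical thread IDs for illustration; not the assignment in Table 2.
COLOR, TEX0 = 0, 1

class ThreadedPort:
    """Toy memory port: in-order within a thread, independent across threads."""

    def __init__(self, thread_ids):
        self.pending = {tid: deque() for tid in thread_ids}

    def request(self, tid, addr, latency):
        """Queue a read on thread `tid` that takes `latency` (>= 1) cycles once at the head."""
        self.pending[tid].append([addr, latency])

    def tick(self):
        """Advance one cycle and return the (tid, addr) responses completed in it."""
        done = []
        for tid, queue in self.pending.items():
            if not queue:
                continue
            queue[0][1] -= 1          # only the head of each thread makes progress
            if queue[0][1] == 0:
                addr, _ = queue.popleft()
                done.append((tid, addr))
        return done

def run(port, cycles):
    """Collect responses in completion order over a number of cycles."""
    order = []
    for _ in range(cycles):
        order.extend(port.tick())
    return order

port = ThreadedPort([COLOR, TEX0])
port.request(COLOR, 0x1000, latency=5)   # slow color-buffer read
port.request(TEX0, 0x2000, latency=2)    # texture pre-fetches on their own thread
port.request(TEX0, 0x2040, latency=2)
responses = run(port, cycles=5)
```

In this run both texture reads complete before the color read, even though the color read was issued first. Within the texture thread the order of issue is still preserved.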

OCP 2.2 supports tagging, which permits out-of-order responses, but this particular DMP core does not support out-of-order handling, for two reasons. Accesses to the color and depth buffers need read-modify-write, lock-based access. The other accesses have no logic or FIFOs for out-of-order support, which keeps the IP core small and suitable for the wide range of applications mentioned above. To obtain efficient memory access without the tag function, the graphics core is optimized for block-based rasterization, in which all pixels generated from a triangle are produced in rectangular blocks such as 4x4. This makes good use of the memory system by keeping burst lengths long and addresses aligned, and it yields a good cache hit rate for the texture and color buffers.
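Block-based rasterization can be sketched in software as follows. This is a generic edge-function rasterizer that walks a triangle in 4x4 blocks, not DMP's hardware algorithm. The tiled buffer layout assumed by `block_base_address`, in which each block occupies one contiguous, aligned run of memory, is likewise an assumption made to show why block order suits long aligned bursts.

```python
BLOCK = 4  # block edge in pixels, matching the 4x4 example in the text

def edge(ax, ay, bx, by, px, py):
    """Signed area term: which side of edge A->B the point P lies on."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)

def rasterize_blocks(tri, width, height):
    """Return [(block_x, block_y, covered_pixels)] for each block the triangle covers."""
    (x0, y0), (x1, y1), (x2, y2) = tri
    area = edge(x0, y0, x1, y1, x2, y2)
    if area == 0:
        return []                      # degenerate triangle
    sign = 1 if area > 0 else -1       # accept either winding order
    # Range of blocks overlapped by the bounding box, clamped to the screen.
    bx_lo = max(int(min(x0, x1, x2)), 0) // BLOCK
    bx_hi = min(int(max(x0, x1, x2)), width - 1) // BLOCK
    by_lo = max(int(min(y0, y1, y2)), 0) // BLOCK
    by_hi = min(int(max(y0, y1, y2)), height - 1) // BLOCK
    blocks = []
    for by in range(by_lo, by_hi + 1):
        for bx in range(bx_lo, bx_hi + 1):
            covered = []
            for y in range(by * BLOCK, min((by + 1) * BLOCK, height)):
                for x in range(bx * BLOCK, min((bx + 1) * BLOCK, width)):
                    cx, cy = x + 0.5, y + 0.5   # sample at the pixel centre
                    if (sign * edge(x1, y1, x2, y2, cx, cy) >= 0
                            and sign * edge(x2, y2, x0, y0, cx, cy) >= 0
                            and sign * edge(x0, y0, x1, y1, cx, cy) >= 0):
                        covered.append((x, y))
            if covered:
                blocks.append((bx, by, covered))
    return blocks

def block_base_address(bx, by, width, bytes_per_pixel=4):
    """Byte offset of a block in an assumed tiled buffer (width a multiple of BLOCK)."""
    blocks_per_row = width // BLOCK
    return (by * blocks_per_row + bx) * BLOCK * BLOCK * bytes_per_pixel

blocks = rasterize_blocks([(0, 0), (8, 0), (0, 8)], width=16, height=16)
```

Under the assumed layout with 4 bytes per pixel, every block starts on a 64-byte boundary, so all pixels a triangle touches within a block can be read or written in one aligned burst instead of many scattered single-pixel accesses.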

The biggest benefit of using OCP is that it is a widely used industry standard, open and available to anyone. In addition, most major SoC vendors use and support OCP. As a result, we can specify parameters in our interface and cache, based on the protocol, to suit the customer's bus access interface and system characteristics, as in the last row of Table 1. This provides a high-level building block concept for both the IP provider and the SoC vendor.

Conclusion
This core was first released at SIGGRAPH 2006 (Fig. 8), and the PICA200 building cores are now available.

We have been developing 3D graphics technology for the embedded space for many years, and using OCP reduces the time and cost of that development. System integration is very time-consuming for both IP core and SoC vendors. The PICA200 core not only supports various embedded applications, but also optimizes performance for each system through the OCP-based building block scheme. OCP provides the complete specification and infrastructure necessary to meet all of the design challenges mentioned above.


Figure 8: Demonstration of FPGA prototype at SIGGRAPH 2006

About the author
Eisaku Ohbuchi works as a hardware engineer on 2D/3D graphics hardware IP at DMP. He worked at NEC and NEC Electronics for three years on the development of application processors for mobile phones and image processing hardware cores. For two years he worked on the development of the PICA(TM) series of graphics accelerators. He can be reached at eisaku.ohbuchi@dmprof.com.

