Video transcoding is the problem of converting one compressed video format into another by decoding to raw video frames and then re-encoding in the new format. In many applications, efficient transcoding is crucial. For example, to support video-on-demand streaming, video is often stored in a single master format to save space, yet many different viewing devices and decoders must be supported. This can be done by transcoding the data just before it is delivered, at real-time or faster-than-real-time rates. In production, editing video requires decoding it, modifying it, and re-encoding it. In the home, video may have to be transcoded into a format that a home video server supports before it can be used there.
In this article, we discuss the design of a high-performance video encoding system, developed using the RapidMind Multi-Core Development Platform, which can exploit both multi-core CPUs and many-core GPUs for faster-than-real-time performance. We discuss both the issues arising from the complexity of video encoding and the benefits and challenges of using multi-core and many-core processors.
Supporting high-definition video-on-demand requires a high-performance transcoding solution. RapidMind has developed a software development platform that makes it possible to exploit the performance of a variety of multi-core processors using a unified parallel programming model. Because the transcoder is built on top of the RapidMind platform, it can run on a variety of processors, including CPUs, GPUs, and the Cell BE, and can also scale up to future multi-core (and many-core) processors as they become available.
By its nature, a transcoder needs to support a variety of video compression formats. However, many formats are similar in the types of computation required to implement them. Also, encoders are usually much more computationally expensive than decoders. Typically, a video standard specifies only what data is stored in the compressed data stream and how the decoder interprets it. It does not specify how the encoder extracts the needed information from the raw input stream.
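Because a transcoder decodes to raw frames before re-encoding, each supported format needs only one decoder and one encoder, not a dedicated converter for every pair of formats. The following is a minimal sketch of that structure, not the RapidMind implementation. The two "formats" (`rle` and `delta`), the flat list of samples standing in for raw video, and every function name are hypothetical stand-ins chosen for illustration.

```python
from typing import Callable, Dict, List

Raw = List[int]  # toy stand-in for raw video: a flat list of sample values


def rle_encode(raw: Raw) -> List[List[int]]:
    """Toy format 1: run-length pairs [value, count]."""
    out: List[List[int]] = []
    for value in raw:
        if out and out[-1][0] == value:
            out[-1][1] += 1
        else:
            out.append([value, 1])
    return out


def rle_decode(data: List[List[int]]) -> Raw:
    return [value for value, count in data for _ in range(count)]


def delta_encode(raw: Raw) -> List[int]:
    """Toy format 2: difference from the previous sample (first from 0)."""
    return [cur - prev for prev, cur in zip([0] + raw[:-1], raw)]


def delta_decode(data: List[int]) -> Raw:
    out: Raw = []
    total = 0
    for diff in data:
        total += diff
        out.append(total)
    return out


# One decoder and one encoder per format; any pair can then be combined.
DECODERS: Dict[str, Callable] = {"rle": rle_decode, "delta": delta_decode}
ENCODERS: Dict[str, Callable] = {"rle": rle_encode, "delta": delta_encode}


def transcode(data, source: str, target: str):
    """Decode to raw samples, then re-encode in the target format."""
    raw = DECODERS[source](data)
    return ENCODERS[target](raw)
```

For example, `transcode([[5, 3], [7, 2], [9, 1]], "rle", "delta")` decodes the run-length data to raw samples and re-encodes them as differences. In a real transcoder the encoder side of this table is where most of the computation goes.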
Typically a compressed video format will require both compression of single frames and the prediction of in-between frames using nearby frames in the video sequence. Some frames are compressed without reference to other frames in order to enable recovery from any errors in transmission and to allow the user to start decompressing in the middle of the sequence.
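The split between independently coded frames and predicted frames can be illustrated with a deliberately simplified sketch. It assumes a toy scheme in which every `key_interval`-th frame is stored whole (marked `"I"`) and every other frame is stored as a per-pixel difference from the previous frame (marked `"P"`). Real formats use lossy compression and motion-compensated prediction instead, and the names `encode`, `decode`, and `key_interval` are hypothetical.

```python
from typing import List, Tuple

Frame = List[int]               # toy "frame": a flat list of pixel values
Packet = Tuple[str, List[int]]  # ("I", pixels) or ("P", per-pixel differences)


def encode(frames: List[Frame], key_interval: int = 4) -> List[Packet]:
    """Store every key_interval-th frame whole and the rest as differences."""
    packets: List[Packet] = []
    previous: Frame = []
    for index, frame in enumerate(frames):
        if index % key_interval == 0:
            # Independently coded frame: no reference to any other frame.
            packets.append(("I", list(frame)))
        else:
            # Predicted frame: only the change from the previous frame.
            diffs = [cur - prev for cur, prev in zip(frame, previous)]
            packets.append(("P", diffs))
        previous = frame
    return packets


def decode(packets: List[Packet], start: int = 0) -> List[Frame]:
    """Decode from packet index `start`, which must be an "I" packet."""
    if packets[start][0] != "I":
        raise ValueError("decoding must begin at an independently coded frame")
    frames: List[Frame] = []
    for kind, data in packets[start:]:
        if kind == "I":
            frames.append(list(data))
        else:
            frames.append([prev + diff for prev, diff in zip(frames[-1], data)])
    return frames
```

Calling `decode(packets, start=4)` on a stream encoded with `key_interval=4` recovers the sequence from the fifth frame onward without touching earlier packets, which is what makes mid-stream start-up possible. For the same reason, damage to an earlier packet cannot propagate past the next independently coded frame.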