The On2 VP6 codec
This article assumes a basic understanding of video compression algorithms. For an intro to video coders, see How video compression works.
This article is an introduction to On2's VP6 codec, one of the most widely used codecs on the Internet by virtue of its deployment in Adobe Flash Player. Over nine hundred User Generated Content (UGC) sites (including eight of comScore's Top Ten) use On2 software to encode hundreds of thousands of new Flash videos every day.
The purpose of a video compressor is to take raw video and compress it into a more manageable form for transmission or storage. A matching decompressor is then used to convert the video back into a form that can be viewed. Most modern codecs, including VP6, are "lossy" algorithms, meaning that the decoded video does not exactly match the raw source. Some information is selectively sacrificed in order to achieve much higher compression ratios. The art of the codec designer is to minimize this loss whilst maximizing the compression.
At first glance, VP6 has a lot in common with other leading codecs. It uses motion compensation to exploit temporal redundancy, a DCT to exploit spatial redundancy, a loop filter to deal with block transform artifacts, and entropy encoding to exploit statistical correlation. However, the "devil is in the details," so to speak, and in this article I will discuss a few of the features that set VP6 apart.
When is a loop filter not a loop filter?
One of the problems with algorithms that use frequency-based block transforms is that the reconstructed video sometimes contains visually disturbing discontinuities along block boundaries. These "blocking artifacts" can be suppressed by means of post-processing filters. However, this approach does not address the fact that these artifacts reduce the value of the current decompressed frame as a predictor for subsequent frames.
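To make the origin of blocking artifacts concrete, here is a minimal one-dimensional sketch in Python. It is not VP6 code. It assumes a block size of 8 and uses a deliberately crude stand-in for coarse quantization: every block keeps only its mean (the DC term), as if all other transform coefficients had been quantized to zero. The names `dc_only_reconstruction` and `boundary_steps` are hypothetical helpers invented for this illustration.

```python
BLOCK = 8  # block size, assumed for illustration only


def dc_only_reconstruction(signal, block=BLOCK):
    """Replace each block with its mean, as if only the DC term survived quantization."""
    out = []
    for start in range(0, len(signal), block):
        chunk = signal[start:start + block]
        mean = sum(chunk) / len(chunk)
        out.extend([mean] * len(chunk))
    return out


def boundary_steps(signal, block=BLOCK):
    """Absolute jump between the two samples on either side of each block boundary."""
    return [abs(signal[i] - signal[i - 1]) for i in range(block, len(signal), block)]


# A smooth gradient: the value rises by 1 per sample, with no visible edges.
ramp = [float(i) for i in range(32)]
decoded = dc_only_reconstruction(ramp)

print(boundary_steps(ramp))     # jumps across block boundaries in the source
print(boundary_steps(decoded))  # jumps across the same boundaries after "decoding"
```

In the source, each block boundary is crossed by the same gentle slope as everywhere else. After the toy reconstruction, all of the change is concentrated at the block boundaries, which is the staircase pattern that the eye picks up as blocking.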
An alternative or complementary approach is to apply a filter within the reconstruction loop of both the encoder and decoder. Such "loop filters" smooth block discontinuities in the reconstructed frame buffers that will be used to predict subsequent frames. In most cases this technique works well, but in some situations it can cause problems. Firstly, loop filtering a whole frame consumes a lot of CPU cycles. Secondly, when there is no significant motion in a region of the image, repeated application of a filter over several frames can lead to problems such as blurring.
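The blurring problem can be illustrated with a toy one-dimensional loop filter in Python. This is a sketch under stated assumptions, not the filter of any real codec: the block size of 8, the 3-tap [1, 2, 1]/4 kernel, and the function name `loop_filter` are all choices made for the example. Real loop filters are adaptive and usually limit how much they alter strong edges, which this toy does not attempt. The simulation models a static region: each new frame is predicted from the previous reconstruction with zero motion and zero residual, and the filter is then applied again.

```python
BLOCK = 8  # block size, assumed for illustration only


def loop_filter(frame, block=BLOCK):
    """Smooth the two samples on either side of every block boundary.

    Uses a 3-tap [1, 2, 1] / 4 kernel read from the unfiltered input.
    Assumes len(frame) is a multiple of `block`.
    """
    out = list(frame)
    for b in range(block, len(frame), block):
        p1, p0, q0, q1 = frame[b - 2], frame[b - 1], frame[b], frame[b + 1]
        out[b - 1] = (p1 + 2 * p0 + q0) / 4
        out[b] = (p0 + 2 * q0 + q1) / 4
    return out


# A genuine image edge that happens to coincide with a block boundary.
recon = [0.0] * 8 + [100.0] * 8

# Static region: frame after frame, the buffer is reused and filtered again.
history = []
for _ in range(3):
    recon = loop_filter(recon)
    history.append(recon[8] - recon[7])  # remaining contrast across the edge

print(history)
```

The contrast across the edge shrinks with every frame even though nothing in the scene has changed, because each pass filters the output of the previous pass.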
VP6 takes an unusual approach to loop filtering. In fact, some would say that it is not a loop filter at all but rather a prediction filter. Instead of filtering the whole reconstructed frame, VP6 waits until a motion vector is coded that crosses a block boundary. At that point it copies the relevant block of image data and filters any block edges that pass through it, creating a filtered prediction block (see Figure 1 below).
Figure 1. VP6 prediction loop filter.
Because the reconstruction buffer itself is never filtered, there is no danger of cumulative artifacts such as blurring. Also, because the filter is only applied where there is significant motion, this approach reduces computational complexity for most frames. When we first implemented this approach in VP6, we saw an improvement of up to 0.25 dB over a traditional loop filter on some clips.
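The idea can be sketched in one dimension as follows. This is an illustration in Python, not the actual VP6 filter: the block size of 8, the 3-tap [1, 2, 1]/4 kernel, integer motion vectors, and the names `smooth_edge` and `predict_block` are all assumptions made for the example. The point it demonstrates is structural: the block is copied out of the reconstruction buffer, any block edge that falls inside the copy is filtered, and the buffer itself is left untouched.

```python
BLOCK = 8  # block size, assumed for illustration only


def smooth_edge(samples, edge):
    """Return a copy with the two samples around `edge` smoothed ([1, 2, 1] / 4).

    `edge` is the index of the first sample to the right of the boundary.
    Neighbour indices are clamped to the ends of the block.
    """
    out = list(samples)
    last = len(samples) - 1
    p1 = samples[max(edge - 2, 0)]
    p0 = samples[edge - 1]
    q0 = samples[edge]
    q1 = samples[min(edge + 1, last)]
    out[edge - 1] = (p1 + 2 * p0 + q0) / 4
    out[edge] = (p0 + 2 * q0 + q1) / 4
    return out


def predict_block(recon, start, mv, block=BLOCK):
    """Build the prediction for the block at `start` using integer motion vector `mv`.

    Assumes start + mv keeps the whole block inside `recon`.
    """
    src = start + mv
    pred = recon[src:src + block]  # slicing copies, so recon is never modified
    offset = src % block
    if offset != 0:
        # The fetched block straddles a boundary of the reference frame,
        # so filter that edge in the copy only.
        pred = smooth_edge(pred, block - offset)
    return pred


# A blocky reference frame: three flat blocks with jumps between them.
recon = [10.0] * 8 + [18.0] * 8 + [26.0] * 8
reference_before = list(recon)

moving = predict_block(recon, start=8, mv=-3)  # crosses the boundary at index 8
static = predict_block(recon, start=8, mv=0)   # block-aligned, no filtering

print(moving)
print(static)
print(recon == reference_before)
```

The block-aligned prediction is returned without any filtering work, which is where the computational saving for static regions comes from. Only the prediction that straddles a block edge is smoothed, and because the smoothing is applied to a copy, it cannot accumulate from frame to frame.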