Design Article

De-interlacing with HQV high quality video processing

Jed Deame
Teranex Business Unit,
Silicon Optix, Inc.

9/30/2005 12:47 AM EDT

Most video sources, including DVD, standard-definition TV, and 1080i high-definition TV, transmit interlaced images. Instead of transmitting each video frame in its entirety (what is called progressive scan), most video sources transmit only half of the image in each frame at any given time. This concept also applies to recording video images: video cameras and film-transfer devices record only half of the image in each frame at a time.

The words "interlaced" and "progressive" arise from the days of CRT or "picture-tube" televisions, which form the image of each frame on the screen by scanning an electron beam horizontally across the picture tube, starting at the top and working its way down to the bottom. Each horizontal line "drawn" by the beam includes the part of the picture that falls within the space occupied by that line. If the scanning is interlaced, the electron beam starts by drawing every other line (all the odd-numbered lines) for each frame; this set of lines is called the odd field. Then it resets back to the top of the screen and fills in the missing information, drawing all the even-numbered lines, which are collectively called the even field. Together, the odd and even fields form one complete frame of the video image.

Because all CRT televisions worked in this manner until very recently, the signal transmitted to them was designed to send the odd lines followed by the even lines. This matched the capabilities of the display device, and it also cut in half the amount of information that had to be sent in a given amount of time; in other words, it reduced the transmission bandwidth by a factor of two, which was good news for broadcasters. (Some modern CRT televisions are capable of scanning each complete frame from top to bottom in a single pass, and thus are said to perform progressive scanning.)

Today, high-definition video displays use digital technologies, such as DLP, LCD, LCOS (including variants SXRD and D-ILA), and plasma, which dominate the television landscape. Instead of "drawing" lines of picture information on the screen, these technologies form images with an array of pixels, and each frame is displayed in its entirety all at once; in other words, all pixels are activated simultaneously to form the complete image rather than forming the image line by line as CRTs do with scanning.

Even so, the video signal that determines what these devices will display is still interlaced or progressive; that is, the information is sent from the source either half a frame or one complete frame at a time. In fact, digital displays ultimately require a progressive signal to operate properly, so if they receive an interlaced signal, it must be converted to progressive before it can be displayed.

Thus, translating the interlaced video signal from DVD and 1080i sources into progressive format is required by all digital displays. This is the job of a video processor, and the process itself is called de-interlacing. Video processors are found in all digital displays as well as many DVD players and other source devices.

If the objects in the video image are not moving, de-interlacing is very easy: the two fields can be woven together to form a complete frame. However, if the recording is performed in an interlaced manner, the two source fields that make up a complete frame are not recorded at the same time. Each frame is recorded as an odd field from one point in time, followed by an even field recorded 1/50th or 1/60th of a second later.

So, if an object in the video has moved in that fraction of a second, simply combining the fields causes image errors called “combing” or “feathering” artifacts.
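The weave operation and the resulting combing can be illustrated with a minimal Python sketch. It is only a toy model: frames are lists of rows, the helper names (split_fields, weave, bar_frame) are invented for this example, and the "scene" is a white vertical bar on a black background.

```python
# Illustrative sketch only: a frame is a list of rows, a row is a list of pixels.

def split_fields(frame):
    """Split a frame into its odd and even fields.

    Lines are numbered from 1 as in the text, so the odd field holds
    lines 1, 3, 5, ... (list indices 0, 2, 4, ...).
    """
    return frame[0::2], frame[1::2]


def weave(odd_field, even_field):
    """Interleave two fields back into one full frame."""
    frame = []
    for odd_row, even_row in zip(odd_field, even_field):
        frame.append(odd_row)
        frame.append(even_row)
    return frame


def bar_frame(x, width=8, height=4):
    """Hypothetical test image: a 2-pixel-wide white bar starting at column x."""
    return [[255 if x <= col < x + 2 else 0 for col in range(width)]
            for _ in range(height)]


# Static scene: both fields show the same instant, so weaving is lossless.
still = bar_frame(2)
odd, even = split_fields(still)
assert weave(odd, even) == still

# Moving scene: the even field is captured after the bar has moved 3 pixels.
odd_t0, _ = split_fields(bar_frame(2))
_, even_t1 = split_fields(bar_frame(5))
combed = weave(odd_t0, even_t1)
for row in combed:
    print("".join("#" if pixel else "." for pixel in row))
```

In the printed result the bar appears at one position on the odd lines and at another on the even lines, which is the comb-like pattern described above.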

An example of feathering or combing

Simplest Competitor Approach (Non-Motion Adaptive):
The simplest approach to avoid these artifacts is to ignore the even fields. This is called a non-motion adaptive approach. In this method, when the two fields reach the processor, data from the even fields are completely ignored.

The video-processing circuitry recreates or “interpolates” the missing lines by averaging pixels from above and below. While there are no combing artifacts, image quality is compromised because half of the detail and resolution have been discarded.

1-Half of the fields are discarded, and we move in on a section of the frame.
2-The missing lines are now recreated by averaging the lines above and below, resulting in poor-quality interpolation.
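A small Python sketch of this line-averaging step shows why detail is lost. The function name and the test image are assumptions made for illustration; the bottom missing line has no neighbor below it, so the sketch simply repeats the line above.

```python
def interpolate_from_odd_field(odd_field):
    """Rebuild a full frame from the odd field alone by line averaging.

    Each missing (even) line is the average of the odd lines above and
    below it. The last missing line has no line below, so it copies the
    line above (an assumption of this sketch).
    """
    frame = []
    for i, row in enumerate(odd_field):
        frame.append(list(row))
        below = odd_field[i + 1] if i + 1 < len(odd_field) else row
        frame.append([(a + b) // 2 for a, b in zip(row, below)])
    return frame


# Hypothetical source frame with one-line detail: bright even lines
# between dark odd lines.
original = [
    [0, 0, 0, 0],
    [200, 200, 200, 200],
    [0, 0, 0, 0],
    [200, 200, 200, 200],
]
odd_field = original[0::2]           # the even field is thrown away
rebuilt = interpolate_from_odd_field(odd_field)
```

Here `rebuilt` has the right number of lines, but every line is black: the bright lines existed only in the discarded even field, and averaging the neighbors cannot bring them back.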

More-advanced techniques have been adopted by virtually all standard-definition video processors, but this basic approach is still sometimes used for high-definition signals, due to the increased computational and data-rate requirements of higher video resolution.

With video processors from some competitors, only 540 lines from a 1080i source are used to create the image that makes it to the screen. This is true even for video processors from companies that may have been considered providers of flagship performance in the standard-definition era.

Advanced Competitor Approach (Frame-based Motion Adaptive):
More advanced de-interlacing techniques available from the competition include a frame-based, motion-adaptive algorithm. By default, these video processors use the same technique described above. However, by using a simple motion calculation, the video processor can determine when no movement has occurred in the entire picture.

If nothing in the image is moving, the processor combines the two fields directly. With this method, still images can have the complete 1080 lines of vertical resolution, but as soon as there is any motion, half of the data is discarded and the resolution drops to 540 lines. So, while static test patterns look sharp, video does not.
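The frame-level decision can be sketched in Python as follows. The motion test used here (compare each field with the same-parity field of the previous frame against a threshold) is an assumed stand-in for the "simple motion calculation" mentioned above, and all function names are hypothetical.

```python
def weave(odd_field, even_field):
    """Interleave two fields into a full frame."""
    frame = []
    for o, e in zip(odd_field, even_field):
        frame += [list(o), list(e)]
    return frame


def line_average(odd_field):
    """Rebuild a frame from the odd field only (even field ignored)."""
    frame = []
    for i, row in enumerate(odd_field):
        below = odd_field[i + 1] if i + 1 < len(odd_field) else row
        frame += [list(row), [(a + b) // 2 for a, b in zip(row, below)]]
    return frame


def frame_has_motion(prev_fields, cur_fields, threshold=10):
    """True if any pixel differs from the same-parity field one frame earlier."""
    for prev_field, cur_field in zip(prev_fields, cur_fields):
        for prev_row, cur_row in zip(prev_field, cur_field):
            for p, c in zip(prev_row, cur_row):
                if abs(p - c) > threshold:
                    return True
    return False


def deinterlace_frame_adaptive(prev_fields, cur_fields):
    odd, even = cur_fields
    if frame_has_motion(prev_fields, cur_fields):
        return line_average(odd)   # any motion: whole frame drops to half resolution
    return weave(odd, even)        # no motion: whole frame keeps full resolution


prev = ([[0, 0, 0, 0], [0, 0, 0, 0]], [[200] * 4, [200] * 4])
cur_static = ([[0, 0, 0, 0], [0, 0, 0, 0]], [[200] * 4, [200] * 4])
cur_moving = ([[255, 0, 0, 0], [0, 0, 0, 0]], [[200] * 4, [200] * 4])

static_out = deinterlace_frame_adaptive(prev, cur_static)
moving_out = deinterlace_frame_adaptive(prev, cur_moving)
```

In `moving_out` a single changed pixel is enough to discard the even field for the entire frame, so the bright even lines disappear everywhere, not just near the change.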

Frame-based motion-adaptive techniques are now common in standard-definition video processors. However, this is still rare in high-definition video processors due to the computational complexity of even frame-level high-definition motion detection.

Silicon Optix HQV Approach (Pixel-Based Motion Adaptive):
HQV processing represents the most advanced de-interlacing technique available: a true pixel-based motion-adaptive approach. With HQV processing, motion is identified at the pixel level rather than the frame level. While it is mathematically impossible to avoid discarding pixels in motion during de-interlacing, HQV processing is careful to discard only the pixels that would cause combing artifacts. Everything else is displayed with full resolution.

Only the pixels that would cause combing are removed.

Pixel-based motion-adaptive de-interlacing avoids artifacts in moving objects and preserves full resolution of non-moving portions of the screen even if neighboring pixels are in motion.
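For contrast with the frame-level approach, here is a hedged Python sketch of a per-pixel decision. It is not the HQV algorithm, whose details the article does not give; the motion test (threshold on same-parity differences across four fields) and the function name are assumptions for illustration.

```python
def deinterlace_pixel_adaptive(prev_fields, cur_fields, threshold=10):
    """Per-pixel choice between weaving and line averaging.

    A missing-line pixel counts as "moving" if the even-field pixel, or
    either odd-field neighbor above/below it, changed since the previous
    frame. Four fields are consulted: two previous, two current.
    """
    prev_odd, prev_even = prev_fields
    odd, even = cur_fields
    frame = []
    for i, odd_row in enumerate(odd):
        frame.append(list(odd_row))
        j = i + 1 if i + 1 < len(odd) else i   # odd line below, clamped at bottom
        out_row = []
        for x, woven in enumerate(even[i]):
            moving = (
                abs(even[i][x] - prev_even[i][x]) > threshold
                or abs(odd[i][x] - prev_odd[i][x]) > threshold
                or abs(odd[j][x] - prev_odd[j][x]) > threshold
            )
            if moving:
                out_row.append((odd[i][x] + odd[j][x]) // 2)   # interpolate this pixel
            else:
                out_row.append(woven)                          # keep full resolution
        frame.append(out_row)
    return frame


prev = ([[0, 0, 0, 0], [0, 0, 0, 0]], [[200] * 4, [200] * 4])
cur = ([[255, 0, 0, 0], [0, 0, 0, 0]], [[200] * 4, [200] * 4])
result = deinterlace_pixel_adaptive(prev, cur)
```

With the same input that forced the frame-based sketch to half resolution, only the one pixel directly below the change is interpolated; the rest of the even lines keep their original values.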

“Second Stage” Diagonal Interpolation
To recover some of the detail lost in the areas in motion, HQV processing implements a multi-direction diagonal filter that reconstructs some of the lost data at the edges of moving objects, filtering out any “jaggies.” This operation is called “second-stage” diagonal interpolation because it’s performed after the de-interlacing, which is the first stage of processing. Since diagonal interpolation is independent of the de-interlacing process, competitors have used similar algorithms with their frame-based de-interlacing approaches.
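The article does not describe the HQV diagonal filter itself. As a stand-in, the sketch below uses edge-based line averaging, a widely known directional interpolation technique: for each missing pixel it tries the vertical direction and two diagonals, and averages along whichever direction has the most similar pair of neighbors. The function names and the three-direction search are assumptions of this example.

```python
def interpolate_vertical(above, below):
    """Plain line averaging: always interpolate straight up and down."""
    return [(a + b) // 2 for a, b in zip(above, below)]


def interpolate_edge_directed(above, below):
    """Edge-based line averaging over three candidate directions."""
    width = len(above)
    out = []
    for x in range(width):
        best = None
        for d in (0, -1, 1):               # vertical first, so ties prefer it
            xa, xb = x + d, x - d          # mirrored positions above and below
            if 0 <= xa < width and 0 <= xb < width:
                diff = abs(above[xa] - below[xb])
                if best is None or diff < best[0]:
                    best = (diff, (above[xa] + below[xb]) // 2)
        out.append(best[1])
    return out


# A diagonal edge: it sits at column 3 on the line above and column 1 below,
# so on the missing line it should sit at column 2.
above = [0, 0, 0, 255, 255, 255]
below = [0, 255, 255, 255, 255, 255]

soft = interpolate_vertical(above, below)
sharp = interpolate_edge_directed(above, below)
```

Vertical averaging produces gray pixels along the edge (the "jaggies" in a full image), while the directional version places a clean edge at the expected column.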

Diagonal Processing


Truth in Marketing
Silicon Optix is not the only company to implement pixel-based motion-adaptive de-interlacing, and it is important to recognize that not all such de-interlacing is identical. In order to implement a true per-pixel motion-adaptive de-interlacer, the video processor must perform a four-field analysis. In addition to the two fields being analyzed in the current frame, the two previous fields are required in order to determine which pixels are in motion. Clearly, if a competing de-interlacer does not evaluate four fields, it simply does not have the data necessary to perform true per-pixel motion-adaptive analysis. Some competing products implement region-based analysis, in which motion is determined by evaluating larger blocks of the image rather than complete frames or individual pixels. A claim of “four-field” analysis alone, then, does not imply per-pixel motion-adaptive de-interlacing.

HQV Processing continues to analyze at the per-pixel level using four-field analysis even in high-definition.


Video Cadencing and Video/Film Detection

Motion picture films are recorded at 24 frames per second. When the movie is released for the home on DVD or a television broadcast, those 24 frames must be converted into 60 interlaced fields. Consider four frames of film: A, B, C, and D.

The first step is to convert these four frames into eight fields. This transforms 24 frames per second (fps) into 48 interlaced fields per second. Then, to account for the faster rate of the NTSC standard (roughly 30 frames per second or 60 interlaced fields per second), it is necessary to repeat certain fields. This is done by adding an extra field every other frame. That is, both fields of frame A are recorded (A-odd, A-even), but three fields of frame B are recorded (B-odd, B-even, B-odd). The cycle repeats with frames C and D. This is called a 2:3 cadence because two fields of one frame are shown followed by three fields of the next frame.
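The field sequence described above can be generated with a few lines of Python. The function name is invented for this sketch, and each field is represented only by a (frame, parity) label rather than by pixel data. Field parity simply alternates in the output stream, which is why frame C begins with an even field.

```python
def pulldown_2_3(frames):
    """Map film frames to a labeled field sequence using a 2:3 cadence."""
    fields = []
    for index, frame in enumerate(frames):
        count = 2 if index % 2 == 0 else 3       # 2 fields, then 3, then 2, ...
        for _ in range(count):
            parity = "odd" if len(fields) % 2 == 0 else "even"
            fields.append((frame, parity))
    return fields


sequence = pulldown_2_3(["A", "B", "C", "D"])
print(" ".join(f"{frame}-{parity}" for frame, parity in sequence))

one_second = pulldown_2_3(list(range(24)))       # 24 film frames
```

Four film frames become ten fields, so 24 frames become 60 fields, matching the rates given in the text.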

Bad cadence

Good cadence

When this sequence is played back on a progressive-scan video display, it is possible to implement the same de-interlacing techniques described earlier (non-motion adaptive vs. motion adaptive, etc.). With film-sourced material, however, it is possible to perfectly reconstruct the original frames without losing any data. Unlike interlaced video, in which the two fields were recorded a fraction of a second apart, the two fields of each film frame were captured at the same instant and only later separated into fields.

So, to display a video signal that originated as 24fps film, all a video processor needs to do is analyze the fields and determine that there is a regularly alternating pattern of two fields followed by three fields, etc. This recognition and reconstruction is commonly called 3:2 pulldown detection or inverse 3:2 pulldown (3:2 and 2:3 name the same cadence), and it is found in all but the worst de-interlacers. Unfortunately, nothing is quite that simple.
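A simplified Python sketch of this detection and reconstruction follows. It assumes clean, noise-free fields so that a repeated field is exactly equal to the field two positions earlier; a real detector would need a tolerance. All names (make_frame, telecine_2_3, looks_like_2_3, inverse_telecine) are hypothetical.

```python
def make_frame(k, height=4, width=4):
    """Hypothetical film frame whose pixel values encode frame and line number."""
    return [[k * 10 + r] * width for r in range(height)]


def telecine_2_3(frames):
    """Turn film frames into a field stream with a 2:3 cadence."""
    fields = []
    for index, frame in enumerate(frames):
        for _ in range(2 if index % 2 == 0 else 3):
            parity = len(fields) % 2             # 0 = odd lines, 1 = even lines
            fields.append(frame[parity::2])
    return fields


def repeated_field_indices(fields):
    """Indices of fields identical to the same-parity field two positions back."""
    return [i for i in range(2, len(fields)) if fields[i] == fields[i - 2]]


def looks_like_2_3(fields):
    """A 2:3 cadence repeats exactly one field in every group of five."""
    repeats = repeated_field_indices(fields)
    return len(repeats) >= 2 and all(b - a == 5 for a, b in zip(repeats, repeats[1:]))


def inverse_telecine(fields):
    """Drop repeated fields and weave the remaining pairs into full frames."""
    repeats = set(repeated_field_indices(fields))
    kept = [(i % 2, f) for i, f in enumerate(fields) if i not in repeats]
    frames = []
    for (p1, f1), (_, f2) in zip(kept[0::2], kept[1::2]):
        odd, even = (f1, f2) if p1 == 0 else (f2, f1)
        frame = []
        for o, e in zip(odd, even):
            frame += [o, e]
        frames.append(frame)
    return frames


film = [make_frame(k) for k in range(4)]
video_fields = telecine_2_3(film)
recovered = inverse_telecine(video_fields)
```

Because both fields of each recovered frame come from the same film frame, `recovered` equals the original `film` with no interpolation at all.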

Mixed Video and Film
Sometimes, further editing and post-processing are done on film that has been converted to video. This includes titles, transitions, and other effects. As a result, simply reconstructing full frames results in combing artifacts because parts of the image are best processed using a standard de-interlacing approach, while other parts will look better by detecting the right cadence and reconstructing the original frames.

Like the various approaches to standard de-interlacing, there are many approaches to dealing with mixed video and film. If the processor interprets the material as film, feathering artifacts will appear around the video portion; if the processor interprets the material as video, the film portion will be displayed at half of its resolution. Some processors determine whether there is more film or more video content and choose the approach with the greatest benefit. Since this usually means film, the result is feathering artifacts. Other processors are designed with the idea that these artifacts should never be seen and use the video de-interlacing techniques in all cases, at the expense of as much as half the video resolution.

HQV Processing, on the other hand, uses per-pixel calculations for all of its processing. This means it is possible for the HQV processor to implement cadence-detection strategies for the pixels that represent film content while implementing pixel-based motion-adaptive de-interlacing for the video content that has been superimposed.

Other Cadences
The HQV Processing advantage of true per-pixel de-interlacing becomes even more evident when dealing with other cadences. Although 24fps film and its associated 2:3 video cadence is the most common format, it isn’t the only cadence used today.

Sometimes, TV stations accelerate their film-based movies and TV shows by dropping every twelfth field to make room for more commercials. This speedup is usually too small to be noticed by the average viewer, but these “vari-speed broadcasts” end up having unusual cadences such as 3:2:3:2:2. If a de-interlacer is unable to detect this sequence, as with most of the competition, half the resolution is lost.

The variety of cadences does not end there. Professional DVCAM camcorders are increasingly used in television and film production. In order to maximize the recording time, these camcorders use a 2:2:2:4 cadence or a 2:3:3:2 cadence to store the progressive source signal as 480i on the tape. Animation gets even more creative with cadences ranging from 5:5 to 6:4 or 8:7 for Japanese anime.

Most competitors’ processors compare the incoming fields and try to match them against known sequences such as 3:2 or 2:2 in order to select the right decoding. This works for the most part, but there can be a short delay before the processor is able to “lock on” and determine the right cadence. In addition, when the video processor encounters an unusual sequence such as animation or DVCAM, it may resort to discarding half the data if it is a non-motion-adaptive processor.

With HQV processing, there is never any confusion about cadence. Instead of trying to match the incoming video against known patterns, HQV processing simply identifies complete frames as they come in. HQV processing is able to identify all known cadences, no matter how uncommon, and it can also detect cadences that have not yet been invented.
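The article does not say how HQV identifies complete frames. One generic way to test whether two fields belong together without consulting a table of cadences is to weave them and measure how much line-to-line zigzag results. The Python sketch below shows such a combing score; it is an assumption for illustration, and it has a known weakness in that genuine one-line detail also raises the score.

```python
def comb_score(odd_field, even_field):
    """Sum of vertical second differences in the woven frame.

    Fields from the same instant weave into smooth vertical structure
    (low score); fields from different instants alternate line by line
    (high score).
    """
    frame = []
    for o, e in zip(odd_field, even_field):
        frame += [o, e]
    score = 0
    for y in range(1, len(frame) - 1):
        for x in range(len(frame[y])):
            score += abs(2 * frame[y][x] - frame[y - 1][x] - frame[y + 1][x])
    return score


def bar_fields(x, width=8, height=4):
    """Odd and even fields of a hypothetical frame with a white bar at column x."""
    frame = [[255 if x <= c < x + 2 else 0 for c in range(width)]
             for _ in range(height)]
    return frame[0::2], frame[1::2]


odd_a, even_a = bar_fields(2)      # both fields from the same frame
odd_b, even_b = bar_fields(5)      # a later frame, bar has moved

same_frame = comb_score(odd_a, even_a)
mixed = comb_score(odd_a, even_b)
```

A field can then be paired with whichever neighboring field gives the lower score, regardless of where it falls in any cadence.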

No matter what type of video you’re watching or where it comes from, HQV processing will always provide the best reconstruction of the image.

Noise Reduction

Random noise is an inherent problem with all recorded images; the result is often called picture grain. Not only does noise get introduced during post-production editing or the final stage of video compression, but it is also present at the source in the form of film grain or imaging-sensor noise. Noise-reduction algorithms can minimize the grain in a picture.

The simplest approach to noise reduction is to use a spatial filter that removes high-frequency data. In this approach, only a single frame is evaluated at any given time, and parts of the image that are one or two pixels in size are nearly eliminated. This does remove the noise, but it also degrades the image quality because there is no way to differentiate between noise and detail. This approach can also cause an artificial appearance in which people look like their skin is made of plastic. This represents the most widely used noise-reduction approach.
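A one-dimensional Python sketch makes the trade-off concrete. The 3-pixel averaging filter below is the simplest possible spatial low-pass filter and is chosen only for illustration; the rows of pixel values are invented.

```python
def box_blur_row(row):
    """3-tap spatial average along one row, with edges clamped."""
    out = []
    for x in range(len(row)):
        left = row[max(x - 1, 0)]
        right = row[min(x + 1, len(row) - 1)]
        out.append((left + row[x] + right) // 3)
    return out


noise_spike = [100, 100, 130, 100, 100]   # flat area with one noisy pixel
fine_detail = [0, 0, 255, 0, 0]           # genuine one-pixel highlight

smoothed_noise = box_blur_row(noise_spike)
smoothed_detail = box_blur_row(fine_detail)
```

The noise spike is reduced from 130 to 110, but the real highlight is cut from 255 to 85 and smeared over three pixels. The filter sees only one frame, so it treats both cases identically.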

Bad noise reduction

Good noise reduction

A temporal filter takes advantage of the fact that noise is a random element of the image that changes over time. Instead of simply evaluating individual frames, a temporal noise filter evaluates several frames at once. By identifying the differences between two frames and then removing that data from the final image, visible noise can be reduced very effectively. If there are no objects in motion, this is a virtually perfect noise-reduction technique that preserves most of the detail. This approach is used by many high-end competitors.

However, a problem arises if there are moving objects in the image, which also cause differences from one frame to the next; of course, these differences should be retained. If moving objects are not distinguished from noise, a ghosting or smearing effect is seen.
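Both the benefit and the ghosting problem show up in a very small Python sketch. Plain averaging of co-located pixels across three frames is used here as the simplest form of temporal filter; the pixel rows are invented test data.

```python
def temporal_average(frames):
    """Average co-located pixels across several frames (one row per frame here)."""
    n = len(frames)
    return [sum(pixels) // n for pixels in zip(*frames)]


# Static gray area (true value 100) with different random noise in each frame.
static_frames = [
    [100, 104, 98, 100],
    [102, 96, 101, 100],
    [98, 100, 101, 100],
]

# A bright object moving one pixel to the right per frame on a black background.
moving_frames = [
    [200, 0, 0, 0],
    [0, 200, 0, 0],
    [0, 0, 200, 0],
]

clean = temporal_average(static_frames)
ghosted = temporal_average(moving_frames)
```

In the static case the noise averages out to the true value. In the moving case the object is smeared into a dim trail across all three positions, which is the ghosting effect described above.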

Bad noise reduction

Good noise reduction

HQV processing uses a per-pixel motion-adaptive and noise-adaptive temporal filter to avoid the artificial appearance and artifacts associated with conventional noise filters. To preserve maximum detail, moving pixels do not undergo unnecessary noise processing. In static areas, the strength of noise reduction is determined on a per-pixel basis, depending on the level of noise in the surrounding pixels as well as in previous frames, allowing the filter to adapt to the amount of noise in the image at any given time. The end result is a natural-looking picture with minimal noise and grain and maximum preservation of fine details.
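The motion-adaptive part of this idea can be sketched in Python as a recursive filter that blends each pixel with the previous output only when the change is small. This is a generic illustration, not the HQV filter: the threshold and the fixed blend strength are assumed values, and the noise-adaptive adjustment of strength described above is deliberately left out.

```python
def motion_adaptive_temporal_filter(prev_out, cur, motion_threshold=20, strength=0.5):
    """Blend with the previous output only where the pixel looks static.

    Large frame-to-frame changes are treated as motion and passed through
    untouched; small changes are treated as noise and averaged down.
    """
    out = []
    for p, c in zip(prev_out, cur):
        if abs(c - p) > motion_threshold:
            out.append(c)                                    # motion: no filtering
        else:
            out.append(round(p * strength + c * (1 - strength)))
    return out


previous = [100, 100, 100, 0]
current = [104, 96, 100, 200]      # small noise on three pixels, new object on the last

filtered = motion_adaptive_temporal_filter(previous, current)
```

The three static pixels are pulled back toward their previous values, while the pixel where an object has appeared keeps its full new value, so no ghost trail is created.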

Bad processing results in poor dynamic range



Good processing

Detail Enhancement
Detail enhancement, also called sharpening, is a necessary component of all digital imaging, both standard definition and high definition. Unfortunately, due to the historically poor implementations of sharpening algorithms, this process has received a reputation as something to avoid.

All digital video goes through a lowpass anti-aliasing filter to prevent false color and moiré effects that can occur during the digitization process. The filter improves overall image quality, but it necessarily blurs some of the detail. The data-compression stage can also remove some detail. Fortunately, much of the lost detail can be mathematically recovered.

Bad detail enhancement

Good detail enhancement

Because the human visual system perceives sharpness in terms of apparent contrast, exaggerating the differences between light and dark can produce what appears to be a sharper image. Unfortunately, due to rudimentary implementations of sharpening in the past, this process has been associated with artifacts known as “ringing” or “halos” in which objects are surrounded by a bright white edge. The resulting image appears harsh and does not reflect what was originally captured. The halos can sometimes be more distracting than the softness from the uncorrected image. For that reason, it is often recommended that users turn down the sharpening on video devices.
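The halo effect is easy to reproduce with a basic unsharp mask, the classic sharpening method that adds back the difference between the image and a blurred copy. The Python sketch below works on a single row; the 3-tap blur and the amount parameter are assumptions chosen for illustration.

```python
def blur_row(row):
    """3-tap average with clamped edges, used as the 'unsharp' copy."""
    last = len(row) - 1
    return [(row[max(x - 1, 0)] + row[x] + row[min(x + 1, last)]) // 3
            for x in range(len(row))]


def unsharp_mask(row, amount=1.0):
    """Exaggerate local contrast: original + amount * (original - blurred)."""
    blurred = blur_row(row)
    sharpened = []
    for original, soft in zip(row, blurred):
        value = round(original + amount * (original - soft))
        sharpened.append(min(255, max(0, value)))            # clamp to 8-bit range
    return sharpened


edge = [50, 50, 50, 200, 200, 200]       # a clean dark-to-light edge
sharpened = unsharp_mask(edge)
```

The transition is steeper, but the pixel just inside the bright side overshoots to 250 and the pixel just inside the dark side undershoots to 0. In a full image those overshoots form the bright and dark outlines seen as halos.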

Bad film detail enhancement


Good film detail enhancement

HQV Detail Enhancement technology is different. By using a more conservative algorithm and selectively identifying the area of blur before processing, HQV Detail Enhancement avoids halo or ringing artifacts at even the highest setting. Of course, it is also possible to disable HQV Detail Enhancement if the source has already applied sharpening. A key benefit of HQV Detail Enhancement is that, when used in conjunction with our 1024-tap scaler, standard-definition TV can be delivered at near high-definition quality.

1024-tap Scaling
Converting standard-definition video to high-definition video involves resizing an image to contain as much as six times the number of pixels it had originally. How this is done determines the quality of the resized image.

The most basic video processors perform their scaling calculations by analyzing no more than four pixels in the source image to create one pixel in the final image. This represents what is called a 4-tap scaler. (Without getting too technical, the number of "taps" determines the number of pixels that are analyzed.) With all other things being equal, a larger number of taps will result in better scaling quality. The average scaler uses no more than 16 taps. However, even this level of scaling can still produce blurry images.
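To make the idea of taps concrete, the Python sketch below upscales a single row with a configurable number of taps per output pixel. The Lanczos kernel is an assumption of this example (the article does not name the filter any scaler uses), the function names are invented, and a real scaler works in two dimensions and may count taps differently.

```python
import math


def lanczos(x, a):
    """Lanczos window of half-width a, evaluated at distance x."""
    if x == 0:
        return 1.0
    if abs(x) >= a:
        return 0.0
    px = math.pi * x
    return a * math.sin(px) * math.sin(px / a) / (px * px)


def upscale_row(row, out_len, taps):
    """Upscale one row; each output pixel is a weighted sum of `taps` source pixels.

    `taps` is assumed to be even. This sketch handles enlargement only.
    """
    a = taps // 2
    scale = len(row) / out_len
    out = []
    for i in range(out_len):
        center = (i + 0.5) * scale - 0.5          # output pixel position in source space
        first = math.floor(center) - a + 1
        total = 0.0
        weight_sum = 0.0
        for tap in range(first, first + taps):
            weight = lanczos(center - tap, a)
            sample = row[min(max(tap, 0), len(row) - 1)]   # clamp at the borders
            total += weight * sample
            weight_sum += weight
        out.append(total / weight_sum)            # normalize so flat areas stay flat
    return out


edge = [0, 0, 0, 255, 255, 255]
two_tap = upscale_row(edge, 12, taps=2)
four_tap = upscale_row(edge, 12, taps=4)
flat = upscale_row([100, 100, 100, 100], 8, taps=16)
```

With more taps, each output pixel draws on a wider neighborhood of the source, which lets the filter follow the underlying image more closely at the cost of more arithmetic per pixel.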

HQV processing uses a scaler with an unprecedented 1024 taps. This level of quality reflects the fact that HQV processing has its roots in Teranex algorithms, which were developed for defense and military image analysis. For every pixel, the HQV processor evaluates the surrounding 1024 pixels in order to provide the best image quality when scaling the image up from standard definition. Again, when this advanced upsampling technology is combined with HQV Detail Enhancement, standard-definition broadcast TV and DVD can be enjoyed with near-HD quality.

10-bit 4:4:4 Internal Data Paths
Not only does HQV processing implement some of the most advanced algorithms for video processing, but these calculations are performed using 10-bits-per-channel internal data paths with full 4:4:4 color processing. (The term "4:4:4" refers to the fact that the color information is processed at full horizontal resolution, and 10-bit data paths provide 1024 steps of brightness or luminance.) In comparison to conventional video processors that only have 8-bit, 4:2:2 color processing (in which the color information is processed at half the horizontal resolution with only 25% of the brightness steps found in 10-bit systems), HQV processors can represent 64 times as many distinct values per pixel throughout all of their computations (2^30 vs. 2^24 combinations, from 30 bits vs. 24 bits). Simply put, by reducing the number of round-off errors, HQV products can preserve all the fine detail and dynamic range found in the original source.
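The arithmetic behind these figures, and the effect of round-off in a narrow data path, can be checked with a short Python sketch. The two-step processing chain (halve the signal, then double it) is an invented example of intermediate rounding, not a description of any real processor.

```python
steps_8_bit = 2 ** 8                       # brightness steps per channel at 8 bits
steps_10_bit = 2 ** 10                     # brightness steps per channel at 10 bits
value_ratio = 2 ** 30 // 2 ** 24           # 30-bit vs. 24-bit combinations per pixel


def roundtrip(value, bits):
    """Run an 8-bit value through a `bits`-wide path: gain 0.5, then gain 2.0."""
    shift = bits - 8
    internal = value << shift              # promote to the internal precision
    internal = internal // 2               # first stage, result stored as an integer
    internal = internal * 2                # second stage undoes the gain
    return internal >> shift               # back to 8 bits for output


levels_8_bit_path = len({roundtrip(v, 8) for v in range(256)})
levels_10_bit_path = len({roundtrip(v, 10) for v in range(256)})
```

In the 8-bit path the intermediate rounding collapses the 256 input levels into 128 output levels, which is the kind of loss that shows up as banding. With two extra bits of headroom, all 256 levels survive the same chain.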

Horizontal Text, Bad Processing

Horizontal Text, Good Processing

HQV Processing: Setting The Standard for Top Quality De-Interlacing
As you can now understand, HQV processing represents an enormous leap in video processing, with true flagship performance in de-interlacing, noise reduction, and scaling with both standard-definition and high-definition signals. Silicon Optix designed HQV processing as a no-compromise solution.

Most competing video processors only have enough computational horsepower to deal with one video stream. As a result, when using the picture-in-picture or split-screen features of a video display, the second video image is not processed with any advanced technologies, resulting in a loss of half the picture detail. In fact, in many competing products, using picture-in-picture or split screen causes both images to lose half of their picture detail. By comparison, video processors with HQV technology have sufficient power to perform the full suite of image-enhancing processing calculations on two simultaneous high-definition sources without losing any resolution.

Vertical Text, Bad Processing

Vertical Text, Good Processing


About the Author

Jed Deame is the Co-Founder and General Manager of Teranex Business Unit, Silicon Optix Inc.

Jed Deame received his BS in Electrical Engineering from the University of Central Florida. He then spent twelve years developing parallel image processing systems at Lockheed Martin, the initial developer of the Teranex technology. As a co-founder and GM of Teranex, he developed the concept for the Teranex Video Computer platform and helped bring the technology to the single chip Silicon Optix Realta HQV video processor.

