It is becoming apparent that end-to-end solutions are needed, and several trade groups are actively studying the problem. SMPTE has created, within the S22 Committee on Television Systems Technology, an Ad Hoc Group on Lip Sync issues to address the problem and produce guidelines. Work on a coordinated studio-centric solution will probably include problem assessment, current practices, control signals, and potential solutions.
Television industry standards organizations have also become involved in setting standards for audio/video sync errors; see, for example, ATSC Document IS-191. Although the AC-3 digital audio standard is mature, implementations have varied, in particular with regard to lip sync. The ATSC Technology Standards Group on Video and Audio Coding (TSG/S6) has been directed to look into these issues, and has established two working groups to gather implementation data and report back with recommendations.
In Canada, the World Broadcasting Unions International Satellite Operations Group (WBU-ISOG) has conducted tests on satellite encoders and decoders. An EBU audio group has performed tests on SDTV receivers. In Japan, the Japan Electronics and Information Technology Industries Association (JEITA) IEC-TC100 has started investigations on TV receiving devices.
Lip sync detection and reduction methods
Aside from a handful of proprietary solutions, no standard solution has yet been proposed. The causes of lip sync error are many, yet well-defined detection solutions are hard to find, and it is difficult for a viewer to judge how far the audio lags or leads the video.
Existing consumer electronics (CE) methods for reducing the timing error between the audio and video portions of content include manual adjustment, based on a delay factor the consumer determines by observation, and automatic adjustment, as in HDMI 1.3, based on a previously estimated delay factor. The disadvantage of manual measurement and adjustment is that it relies on a human-perceived delay, which is hard to judge accurately.
The method of automatically delaying the audio by an estimated factor, based on the expected delay in the video signal during processing, is also an imperfect solution since the audio and video signals may be routed through a number of devices, and may undergo a number of processing steps. Each additional device or step can impact the ultimate lip sync error.
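The effect of a multi-device chain on a fixed estimate can be illustrated with a little arithmetic. The sketch below is only an illustration: the stage names and latency figures are hypothetical, and the function `residual_lip_sync_error_ms` is introduced here for the example rather than taken from any product or standard.

```python
# Hypothetical per-stage latencies in milliseconds (illustrative figures only).
video_chain_ms = {
    "set-top decoder": 30,
    "video processor": 50,
    "display panel": 40,
}
audio_chain_ms = {
    "set-top decoder": 10,
    "A/V receiver DSP": 15,
}


def residual_lip_sync_error_ms(video_stages, audio_stages, audio_compensation_ms):
    """Return the remaining sync error after a fixed audio compensation delay.

    Positive result: audio still leads the video.
    Negative result: audio lags the video.
    """
    video_total = sum(video_stages.values())
    audio_total = sum(audio_stages.values()) + audio_compensation_ms
    return video_total - audio_total


# An estimate that only accounts for the display panel leaves a residual error.
panel_only = residual_lip_sync_error_ms(video_chain_ms, audio_chain_ms, 40)

# Compensating for the whole chain (120 ms video minus 25 ms audio) removes it.
full_chain = residual_lip_sync_error_ms(video_chain_ms, audio_chain_ms, 95)

print(panel_only, full_chain)  # 55 0
```

With these assumed figures, an estimate that covers only one device in the chain leaves the audio 55 ms ahead of the picture, which is the kind of residual error each additional device or processing step can introduce.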
Some HDMI 1.3 A/V receivers incorporate an audio synchronizer, which can be used to correct or maintain proper audio/video sync. To correct audio/video sync problems, the HDTV outputs timing delay information that reports the amount of delay the video signal experiences. The audio synchronizer receives the delay information and in response delays the audio by an equivalent amount, thereby theoretically maintaining proper synchronization.
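Conceptually, delaying the audio by a reported amount is a simple first-in, first-out buffer. The following is a minimal sketch of that idea, not the mechanism of any particular receiver: the class name `AudioDelayLine`, the 48 kHz default sample rate, and the per-sample interface are all assumptions made for illustration.

```python
from collections import deque


class AudioDelayLine:
    """Delay a stream of audio samples by a fixed time (illustrative sketch)."""

    def __init__(self, delay_ms, sample_rate_hz=48000):
        # Convert the reported video delay into a whole number of samples.
        self.delay_samples = round(delay_ms * sample_rate_hz / 1000)
        # Pre-fill with silence so the first outputs are zeros.
        self._buffer = deque([0.0] * self.delay_samples)

    def process(self, sample):
        """Push one input sample and return the sample delayed by delay_samples."""
        self._buffer.append(sample)
        return self._buffer.popleft()


# Small example: a 1 ms delay at a 3 kHz sample rate is 3 samples.
line = AudioDelayLine(delay_ms=1, sample_rate_hz=3000)
delayed = [line.process(x) for x in [1.0, 2.0, 3.0, 4.0, 5.0]]
print(delayed)  # [0.0, 0.0, 0.0, 1.0, 2.0]
```

A fixed buffer like this only holds sync for as long as the reported video delay stays constant.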
The actual delay needed for synchronization depends on the type of audio and video signals and on the current video mode. Unfortunately, video delays frequently make quick and large changes. To maintain proper audio/video sync, the audio delay needs to track these video delay changes, but abrupt changes in audio delay can produce audible artifacts such as pops, clicks, gaps, or pitch errors. This rapidly changing information is typically not conveyed in real time by today's products.
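One way to picture the trade-off is a tracker that moves the audio delay toward the reported video delay in small, bounded steps instead of jumping. This is a hypothetical sketch, not something HDMI 1.3 or any cited product is known to do: the function `track_video_delay` and the `max_step_ms` limit are assumptions for illustration.

```python
def track_video_delay(reported_video_delay_ms, max_step_ms=2.0,
                      initial_audio_delay_ms=0.0):
    """Follow a changing video delay with a rate-limited audio delay.

    Each update moves the audio delay toward the reported video delay by at
    most max_step_ms, trading a temporary sync error for smaller, less
    audible delay changes.
    """
    audio_delay = initial_audio_delay_ms
    history = []
    for target in reported_video_delay_ms:
        error = target - audio_delay
        # Clamp the correction to the allowed step size in either direction.
        step = max(-max_step_ms, min(max_step_ms, error))
        audio_delay += step
        history.append(audio_delay)
    return history


# The video delay jumps from 0 ms to 10 ms; the audio delay catches up gradually.
print(track_video_delay([10, 10, 10, 10, 10, 10]))
# [2.0, 4.0, 6.0, 8.0, 10.0, 10.0]
```

In this sketch the audio is out of sync for several updates after the jump, which shows why large, rapid video delay changes are hard to follow cleanly: a fast correction risks audible artifacts, while a slow one leaves a visible lip sync error.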