By Paul Wheeler - Engineering Manager, Analog Devices K.K., Tokyo, Japan and Sharon St. Ours - Program
Manager, Analog Devices Inc., Norwood, Mass.
(Ed Note: This article first appeared in Planet Analog)
Surround sound home theater systems have become an essential part of the home entertainment
experience. Over 16 million homes have installed a theater system in the last
few years. All indicators show that this market will continue to grow over the
next decade. This presents home theater component designers with both great
opportunities and design challenges.
Home theater systems range in features and in price. The basic system is the
"Home Theater in a Box" which is available from most electronics distributors
and resellers. The system will generally consist of a home theater receiver,
surround speakers (usually four), a center channel speaker and a powered
subwoofer. A DVD player and a remote control are often included as well. The
DVD player will often accommodate newer formats like DVD-Audio or SACD (Super
Audio Compact Disc), and will output a variety of video formats such as
component video, S-Video or progressive scan for high definition television.
For the audiophile or theater buff, a higher-end system is in order. The
television will be upgraded to a plasma screen and the audio components will be
selected for their enhanced features. Certainly a high-end audio home system
will include seven speakers and a subwoofer to take advantage of the full 7.1
surround formats available today. The DVD player will be able to play multiple
formats automatically such as DVD-Audio, DVD-Video, DVD-R, Video CD, CD-R/RW
media and MP3 discs.
The receiver, or AVR, is the heart of the high-end home theater system. This
component will handle the audio decode algorithms like Dolby Digital and DTS. In
addition the receiver will reproduce the sound effects of a dubbing stage for
the "true" theater experience in your home. The audio decoders include virtual
surround (a multi-channel effect from just two speakers), stereo-to-multi-channel
decode, multi-channel surround using all six or eight speakers, and
discrete surround decoding. The AVR component will also handle multiple zones
and an array of inputs and outputs including coaxial and optical.
All of the high-end systems also include an audio calibration feature. This
function detects where the speakers are placed in a room, the shape and size of
the room, and the room's reflective and absorption qualities. The system will
automatically adjust the sound to optimize the setup for any room. Many of the
features that are adjusted can also be manually set according to the listener's
preference. These include independent channel control, variable crossover
frequency, and audio delay adjustment.
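As a rough illustration of the audio delay adjustment mentioned above, the
following Python sketch computes per-channel delays from speaker distances so
that sound from every speaker reaches the listener at the same time. The
function name, the speaker labels and the 343 m/s speed of sound are
assumptions made for this example; they do not describe any particular
receiver's calibration routine.

```python
# Assumed speed of sound in air at room temperature (about 20 degrees C).
SPEED_OF_SOUND_M_S = 343.0


def channel_delays_ms(distances_m):
    """Return the delay in milliseconds to apply to each channel.

    distances_m maps a (hypothetical) speaker label to its distance from
    the listening position in meters. The farthest speaker gets no delay;
    closer speakers are delayed by the extra travel time of the farthest.
    """
    farthest = max(distances_m.values())
    return {
        name: 1000.0 * (farthest - distance) / SPEED_OF_SOUND_M_S
        for name, distance in distances_m.items()
    }


if __name__ == "__main__":
    # Front speakers at 3.43 m, a surround speaker at half that distance.
    for name, delay in channel_delays_ms(
        {"front_left": 3.43, "center": 3.43, "surround_left": 1.715}
    ).items():
        print(f"{name}: {delay:.2f} ms")
```

A speaker that sits 1.715 m closer than the farthest one is delayed by about
5 ms, which is the time sound needs to cover that extra distance.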
With all the various combinations of decode formats, virtualization and post
processing, the system has to be highly flexible. A high-performance digital
signal processor (DSP) is ideal for running the decoders and has the
programmability and flexibility to perform additional functions such as
automatic room equalization. Whether the signal processing is performed in the
DVD component or the AVR system, the digital signal processor is the central
processing engine of the entire system. With audio sources such as PCM, SACD,
DVD-Audio, DTS and AC3, and post-processing algorithms such as DPLII, Neo6,
Surround EX, ES Matrix, bass and delay, the processor must support a myriad of
combinations while dynamically detecting changes in the input streams to invoke
the correct decoding software.
A mid- to high-end AVR
Figure 1 shows a basic block diagram for a mid- to high-end home theater
system that uses two 32-bit floating point digital signal processors. One signal
processor may be sufficient depending on the individual system's processing
requirements. For lower cost systems, the signal processor may run the various
processing algorithms in internal ROM and RAM excluding the need for external
Figure 1. Components for a mid- to high-end AVR system.
To describe the types of processing that an audio signal processor must
perform, we will first explore the audio sources that are available on the
Automatic input detection
The audio DSP must be capable of recognizing the various formats in real time
and invoking the suitable decoder as soon as possible. Any error in
detection could result in noise at the output. This auto-detection process must
run frame by frame in parallel with the decoding process. (The Analog Devices
SHARC signal processor, for example, makes use of a modified Harvard
architecture to input samples in the background while decoding the previous
frame.) The auto-detection process is based on the IEC 61937 international
standard for non-linear PCM encoded bit streams. This standard defines how
encoded streams are sent to the DSP via the Sony/Philips Digital Interface
Format (S/PDIF) receiver. The audio processor must determine whether the input
stream is an encoded stream (such as AC3 or DTS) or a PCM stream. The auto-detect
module uses the IEC 61937 standard as a base for this recognition in tandem with
the stream characteristics of each format, which will be detailed in the next
Figure 2. Audio DSP functional blocks include a source detector and a
variety of decoders.
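The source-detection step described above can be sketched in a few lines of
Python. The sketch assumes that the S/PDIF receiver delivers 16-bit words and
that an encoded burst starts with the IEC 61937 preamble words Pa and Pb,
followed by a burst-info word whose low five bits carry the data type. The
constant values and the function name are assumptions for illustration and
should be checked against the standard; a production detector would also
examine the stream characteristics of each format.

```python
from typing import Sequence

# IEC 61937 burst preamble sync words (Pa, Pb), as commonly documented.
PREAMBLE_PA = 0xF872
PREAMBLE_PB = 0x4E1F

# Data-type codes carried in the burst-info word (Pc).
# Assumed values for this sketch; verify against the standard.
DATA_TYPES = {
    0x01: "AC3",
    0x0B: "DTS",  # DTS type I
    0x0C: "DTS",  # DTS type II
    0x0D: "DTS",  # DTS type III
}


def detect_format(words: Sequence[int]) -> str:
    """Classify a block of 16-bit S/PDIF words as PCM or an encoded format."""
    # Stop three words early so a full four-word preamble (Pa, Pb, Pc, Pd)
    # is always available at position i.
    for i in range(len(words) - 3):
        if words[i] == PREAMBLE_PA and words[i + 1] == PREAMBLE_PB:
            data_type = words[i + 2] & 0x1F  # low five bits of Pc
            return DATA_TYPES.get(data_type, "UNKNOWN")
    return "PCM"  # no burst preamble found in this block


if __name__ == "__main__":
    print(detect_format([0x0000, 0x0000, 0xF872, 0x4E1F, 0x0001, 0x3000]))
    print(detect_format([0x0012, 0x0034, 0x0056, 0x0078, 0x009A]))
```

Because the check runs on each incoming block, a change from PCM to an encoded
stream (or back) is seen at the next preamble, which is when the matching
decoder would be invoked.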
One example is Dolby Digital (Dolby AC3), a multi-channel audio encoding
technology developed by Dolby Labs and publicly announced in mid-1991. This
encoding mechanism is a "lossy" algorithm, which means that it takes advantage
of the masking effects of the human ear to reduce the amount of information
needed to convey a particular sound. The algorithm uses a "dynamic bit
allocation" methodology: frequencies to which the human ear is more sensitive
are allocated more bits so that the original sound is reconstructed more
accurately. The algorithm can encode from one (mono) to 5.1 audio channels,
with the main channels covering 20 Hz to 20 kHz and the ".1" low-frequency
effects (LFE) channel covering 3 to 120 Hz. It supports sampling frequencies of
32, 44.1 and 48 kHz, and data rates from 32 kbps to 640 kbps. (The basic Dolby
Digital decoding method is shown in Figure 3.)
Figure 3. The Basic Dolby Digital (AC3) decoding process.
It should be noted that the actual decoding process is more complicated than
the simple diagram above, as we have not included error processing, channel
de-coupling, de-matrixing or dynamic variation of filter banks.
The Dolby Digital (AC3) payload stream is defined in the ATSC standard for
Digital Audio Compression and is shown in Figure 4.
Figure 4. The payload frame format for a Dolby AC3 encoded stream includes
sync and CRC, as well as six audio blocks.
The frame contains two CRC (cyclic redundancy check) words that are used to
detect errors in the frame. The frame also contains a BSI (bit stream
information) field that holds information about the sample rate, the data rate,
the number of encoded channels, and the audio services that are available in
the stream, such as karaoke mode or dynamic range control. An auxiliary field is
also reserved for control status. The payload frame contains six audio blocks,
each of which contains 256 audio samples per channel, so every frame represents
a fixed 1,536 audio samples per channel. The actual length of the frame in
bytes varies with sample rate and data rate.
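The relationship between the fixed sample count and the variable frame length
can be worked through numerically. The sketch below assumes only what is stated
above (six blocks of 256 samples per frame) and computes the frame duration and
the average frame size in bytes for a given sample rate and data rate. The
function names are made up for this example.

```python
BLOCKS_PER_FRAME = 6
SAMPLES_PER_BLOCK = 256
SAMPLES_PER_FRAME = BLOCKS_PER_FRAME * SAMPLES_PER_BLOCK  # 1,536


def frame_duration_ms(sample_rate_hz):
    """Time span covered by one frame, in milliseconds."""
    return 1000.0 * SAMPLES_PER_FRAME / sample_rate_hz


def average_frame_bytes(bit_rate_bps, sample_rate_hz):
    """Average encoded frame size in bytes at the given data rate."""
    return bit_rate_bps * SAMPLES_PER_FRAME / (8 * sample_rate_hz)


if __name__ == "__main__":
    for rate in (32000, 44100, 48000):
        print(
            f"{rate} Hz: {frame_duration_ms(rate):.2f} ms per frame, "
            f"{average_frame_bytes(384000, rate):.1f} bytes at 384 kbps"
        )
```

At 48 kHz a frame covers 32 ms, so a 384 kbps stream needs 1,536 bytes per
frame; at 32 kHz the same frame covers 48 ms and the byte count grows
accordingly.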
The DTS method of multi-channel encoding involves two basic processes:
sub-band filtering and ADPCM (adaptive differential pulse code modulation). The
encoder operates on various PCM frame sizes, with the longer frame sizes
normally kept for low bit rate applications to maintain quality. The currently
supported frame sizes are 256, 512, 1,024, 2,048 and 4,096 samples. The main
difference from the Dolby encoder is the provision for further extensions to
the DTS format, such as the recent additions to the encoder family of DTS 6.1
and DTS 96/24 (a 96 kHz sampling rate and 24-bit resolution). DTS 6.1 contains
the same discrete 5.1 channels as most DVDs, but also includes a discrete
surround back channel. Until the release of DTS 6.1, the Surround Back channels
were created using matrix processing on the Surround Left and Right channels.
DTS 96/24, the newest addition in the quest for high-quality audio, uses the
encoding method shown in Figure 5. The "core" signal, a 24-bit, 48 kHz, 5.1
signal, is created by low-pass filtering and downsampling the 96 kHz signal to
a 48 kHz sampling rate and encoding it with the original DTS encoder. This
encoded "core" signal is then decoded to recreate a 5.1, 48 kHz PCM signal. The
reconstructed signal is subtracted from the original 96 kHz, 5.1 PCM signal to
give the difference signal, which includes the coding error in the band that
the core covers (0 to 24 kHz) and the original high-frequency information above
it (24 to 48 kHz). This "difference" signal is then DTS encoded as an
"extension" stream and combined with the "core" encoded stream to give a
DTS 96/24 frame.
Figure 5. High sampling rate audio encoding uses extension frames to
achieve a 96-kHz rate.
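The core-plus-extension structure described above can be demonstrated with a
toy model. The sketch below is not the DTS codec: a pair-averaging decimator
stands in for the low-pass filter, a coarse integer quantizer stands in for the
lossy core encode and decode, and the extension is stored without any further
coding. All function names are hypothetical. It only shows how a difference
signal lets a full decoder recover what the core alone cannot, while a simpler
decoder can still use the core by itself.

```python
def downsample_2x(samples):
    """Crude low-pass and decimate: average each pair of samples."""
    return [(samples[i] + samples[i + 1]) // 2
            for i in range(0, len(samples) - 1, 2)]


def upsample_2x(samples):
    """Return to the high rate by repeating every sample once."""
    out = []
    for s in samples:
        out.extend([s, s])
    return out


def lossy_core_codec(samples, step=16):
    """Stand-in for the core encode/decode round trip: coarse quantization."""
    return [(s // step) * step for s in samples]


def encode(pcm_high_rate):
    """Split high-rate PCM (even length) into core and extension parts."""
    core = lossy_core_codec(downsample_2x(pcm_high_rate))
    predicted = upsample_2x(core)
    # The difference holds the core's coding error plus the detail that the
    # half-rate core cannot represent.
    extension = [o - p for o, p in zip(pcm_high_rate, predicted)]
    return core, extension


def decode(core, extension=None):
    """Decode core only (half rate), or core plus extension (full rate)."""
    if extension is None:
        return list(core)
    return [p + e for p, e in zip(upsample_2x(core), extension)]


if __name__ == "__main__":
    original = [0, 100, 200, 50, -300, -120, 7, 9]
    core, extension = encode(original)
    print("core only:       ", decode(core))
    print("core + extension:", decode(core, extension))
```

A decoder that ignores the extension still produces a valid half-rate signal
from the core, which is the property that keeps older decoders compatible.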
The encoded frame for DTS is shown in Figure 6. The data frame is made up of
five main parts: the synchronization word, which defines the frame start; the
frame header, which contains information about the number of channels and the
bit rate; up to 16 sub-frames, which contain the "core" encoded audio data; an
optional user data space; and extension data. Extension data may contain the high
frequency components (for DTS96/24) or the discrete Surround Back channel (for
Figure 6. The encoded frame for DTS includes a data frame made up of five
main parts: the synchronization word, which defines the frame start; the frame
header, which contains information about the number of channels and the bit
rate; the sub-frames; a user data space; and extension data.
The decode method for this frame is shown in Figure 7. As can be seen, the
audio decoder, depending on its processing power, can choose to ignore the
extension stream and decode only the "core" audio. This backward compatibility
means that older decoders can still decode DVDs that include extension streams
without additional software.
Figure 7. On DTS decode, the audio decoder can choose to ignore the
extension stream and decode only the "core" audio.
End of Part 1