[Part 2 discusses numeric formats as they relate to audio processing, with a focus on dynamic range and precision. For more on audio converters, see Basics of ADCs and DACs. For the accompanying video tutorial, see Fundamentals of embedded video.]
Audio functionality plays a critical role in embedded media processing. Although audio generally requires less processing power than video, it should be considered equally important.
In this article, the first of a three-part series, we will explore how data is presented to an embedded processor from a variety of audio converters (DACs and ADCs). Following this, we will explore some common peripheral standards used for connecting to audio converters.
Converting between Analog and Digital Audio Signals
All A/D and D/A conversions should obey the Shannon-Nyquist sampling theorem. In short, this theorem dictates that an analog signal must be sampled at a rate (the Nyquist sampling rate) equal to or exceeding twice its bandwidth (the Nyquist frequency) in order for it to be reconstructed in the eventual D/A conversion. Sampling below the Nyquist sampling rate will introduce aliases, which are low-frequency "ghost" images of frequencies that fall above half the sampling rate. For example, if we take an audio signal that is band-limited to 0-20 kHz and sample it at 2 × 20 kHz = 40 kHz, then the sampling theorem assures us that the original signal can be reconstructed without any signal loss. Sampling this 0-20 kHz band-limited signal at anything less than 40 kHz will introduce distortions due to aliasing. Figure 1 shows the aliasing effect on a 20 kHz sine wave. When sampled at 40 kHz, a 20 kHz signal is represented correctly (Figure 1a). However, the same 20 kHz sine wave sampled at 30 kHz actually looks like a lower-frequency alias of the original sine wave (Figure 1b).
Figure 1. (a) Sampling a 20 kHz signal at 40 kHz captures the original signal correctly
(b) Sampling the same 20 kHz signal at 30 kHz captures an aliased (low frequency ghost) signal.
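The frequency of the alias in Figure 1b can be worked out directly. The following is a minimal sketch, assuming a single real sinusoid and an ideal sampler; the function name `alias_frequency` is hypothetical and introduced only for illustration.

```python
def alias_frequency(f_signal_hz, f_sample_hz):
    """Apparent frequency of a real sinusoid after sampling at f_sample_hz."""
    # Sampling cannot distinguish frequencies that differ by a multiple
    # of the sampling rate, so reduce the input modulo f_sample_hz.
    folded = f_signal_hz % f_sample_hz
    # For a real sinusoid, f and (f_sample - f) produce the same apparent
    # frequency, so fold the result into the range 0..f_sample/2.
    return min(folded, f_sample_hz - folded)


for fs in (40_000, 30_000):
    print(fs, "Hz sampling ->", alias_frequency(20_000, fs), "Hz observed")
```

Under these assumptions, the 20 kHz tone sampled at 40 kHz is still observed at 20 kHz, while the same tone sampled at 30 kHz is observed at 10 kHz, which is the lower-frequency alias the figure illustrates.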
No practical system will sample at exactly twice the Nyquist frequency, however. This is because restricting a signal to a specific band requires an analog low-pass filter. Since analog filters are never ideal, high-frequency components above the Nyquist frequency can still pass through, causing aliasing. Therefore, it is common to sample above the Nyquist sampling rate in order to minimize this aliasing. For example, the sampling rate for CD audio is 44.1 kHz, not 40 kHz, and many high-quality systems sample at 48 kHz in order to capture the 0-20 kHz range of hearing even more faithfully.
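One way to quantify the benefit of the extra sampling rate is to compute how much room the analog filter has to roll off. This is a rough sketch under an assumption not stated in the text: an out-of-band component at frequency f (between half the sampling rate and the sampling rate) aliases to fs - f, so it lands back inside the 0-20 kHz band only when f is at least fs - 20 kHz. The helper name `guard_band_hz` is hypothetical.

```python
def guard_band_hz(f_sample_hz, bandwidth_hz):
    """Width of the region between the top of the wanted band and the
    lowest frequency whose alias falls back into that band."""
    # Aliases land inside 0..bandwidth for inputs at or above
    # (f_sample - bandwidth), so the filter may roll off between
    # bandwidth and (f_sample - bandwidth).
    return f_sample_hz - 2 * bandwidth_hz


for fs in (40_000, 44_100, 48_000):
    print(fs, "Hz sampling ->", guard_band_hz(fs, 20_000), "Hz of guard band")
```

With a 20 kHz band, sampling at 40 kHz leaves no guard band at all, which is why a non-ideal filter cannot prevent aliasing at that rate, while 44.1 kHz and 48 kHz leave 4.1 kHz and 8 kHz respectively.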
For speech signals, the energy content below 4 kHz is enough to reproduce speech intelligibly. For this reason, telephony applications usually use only 8 kHz sampling (= 2 × 4 kHz). Table 1 summarizes some sampling rates used by familiar systems.
Table 1. Commonly used sampling rates.
The most common digital representation for audio is called PCM (pulse-code modulation). In this representation, an analog amplitude is encoded with a digital level for each sampling period. The resulting digital wave is a vector of snapshots taken to approximate the input analog wave.
Since all A/D converters have finite resolution, they introduce quantization noise that is inherent in digital audio systems. Figure 2 shows a PCM representation of an analog sine wave (Figure 2a) converted using an ideal A/D converter. In this case, quantization manifests itself as the "staircase effect" (Figure 2b). You can see that lower resolution leads to a worse representation of the original wave (Figure 2c).
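The staircase effect can be reproduced numerically. The sketch below is illustrative only: it assumes a uniform quantizer over a normalized input range of -1.0 to 1.0 with two's-complement output codes, and the names `quantize` and `max_quantization_error` are hypothetical rather than taken from any particular converter or library.

```python
import math


def quantize(x, bits):
    """Uniform PCM quantizer for x in [-1.0, 1.0).

    Returns (integer code, reconstructed value)."""
    step = 2.0 / (1 << bits)              # size of one quantization level
    code = round(x / step)                # nearest level
    lo = -(1 << (bits - 1))               # most negative two's-complement code
    hi = (1 << (bits - 1)) - 1            # most positive two's-complement code
    code = max(lo, min(hi, code))         # clip to the representable range
    return code, code * step


def max_quantization_error(bits, n_samples=32, amplitude=0.7):
    """Worst-case error over one period of a sampled sine wave."""
    worst = 0.0
    for n in range(n_samples):
        x = amplitude * math.sin(2 * math.pi * n / n_samples)
        _, xq = quantize(x, bits)
        worst = max(worst, abs(x - xq))
    return worst


for bits in (3, 8, 16):
    print(bits, "bits -> worst-case error", max_quantization_error(bits))
```

As long as the input stays within the unclipped range, the error never exceeds half a quantization step, so each additional bit of resolution halves the worst-case error bound.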
For a numerical example, let's assume that a 24-bit A/D converter is used to sample an analog signal whose range is -2.828 V to 2.828 V (5.656 Vpp). The 24 bits allow for 2^24 (16,777,216) quantization levels. Therefore, the effective voltage resolution is 5.656 V / 16,777,216 = 337.1 nV. In the second part of this series, we'll see how codec resolution affects the dynamic range of audio systems.
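The same arithmetic can be checked for other resolutions with a short script. This is a minimal sketch assuming an ideal converter whose levels are spread evenly across the full input range; `lsb_volts` is a hypothetical helper name.

```python
def lsb_volts(v_min, v_max, bits):
    """Voltage spanned by one quantization level of an ideal converter."""
    return (v_max - v_min) / (2 ** bits)


lsb = lsb_volts(-2.828, 2.828, 24)
print(f"{lsb * 1e9:.1f} nV")  # prints "337.1 nV"
```

Calling `lsb_volts(-2.828, 2.828, 16)` for a 16-bit converter over the same range gives a step 256 times larger, roughly 86.3 µV.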
Figure 2. (a) An analog signal (b) Digitized PCM signal (c) Digitized PCM signal using fewer bits of precision.