[Part 1 explains the basics of audio converters and the common standards for connecting processors to these converters. (For more on numeric formats, see Fixed vs. floating point: a surprisingly hard choice.) Part 3 of this series reviews data-management schemes such as double-buffering and discusses the basics of audio algorithms.]
Dynamic Range and Precision
You may have seen dB specs thrown around for various products available on the market today. Table 1 lists a few fairly established products along with their assigned signal quality, measured in dB.
Table 1: Dynamic range comparison of various audio systems.
So what exactly do those numbers represent? Let's start by getting some definitions down. Use Figure 1 as a reference signal for the following "cheat sheet" of the essentials.
Figure 1: Relationship between some important terms in audio systems.
The dynamic range of the human ear (the ratio of the loudest to the quietest signal level) is about 120 dB. In systems where noise is present, dynamic range is described as the ratio of the maximum signal level to the noise floor. In other words,
Dynamic Range (dB) = Peak Level (dB) - Noise Floor (dB)
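As a quick illustration of the relationship above, here is a minimal Python sketch. The function names and the example levels (a 0 dB peak and a -96 dB noise floor) are assumptions chosen for illustration, not figures taken from Table 1.

```python
import math

def amplitude_ratio_to_db(level, reference):
    """Express an amplitude (voltage) ratio in decibels."""
    return 20.0 * math.log10(level / reference)

def dynamic_range_db(peak_db, noise_floor_db):
    """Dynamic range is the peak level minus the noise floor, both in dB."""
    return peak_db - noise_floor_db

# Hypothetical levels, chosen only to show the arithmetic.
peak_db = 0.0
noise_floor_db = -96.0
print(dynamic_range_db(peak_db, noise_floor_db))  # 96.0
```

Because both levels are already logarithmic, the ratio of maximum signal to noise floor becomes a simple subtraction.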
The noise floor in a purely analog system comes from the electrical properties of the system itself. In digital systems, audio signals also acquire noise from the ADCs and DACs, as well as from the quantization error introduced when each sample is rounded to a finite word width.
Another important measure is the signal-to-noise ratio (SNR). In analog systems, this means the ratio of the nominal signal to the noise floor, where "line level" is the nominal operating level. On professional equipment, the nominal level is usually 1.228 Vrms, which translates to +4 dBu. The headroom is the difference between nominal line level and the peak level where signal distortion starts to occur. The definition of SNR is a bit different in digital systems, where it is defined as the dynamic range.
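The line-level and headroom figures can be checked with a short Python sketch. It assumes the usual dBu reference of about 0.7746 Vrms (the voltage that dissipates 1 mW in 600 ohms). The +24 dBu peak level is a hypothetical value used only to show the headroom arithmetic.

```python
import math

# 0 dBu reference voltage: sqrt(0.001 W * 600 ohm), about 0.7746 Vrms.
DBU_REFERENCE_VRMS = math.sqrt(0.6)

def volts_rms_to_dbu(volts_rms):
    """Convert an RMS voltage to dBu."""
    return 20.0 * math.log10(volts_rms / DBU_REFERENCE_VRMS)

def headroom_db(peak_dbu, nominal_dbu):
    """Headroom is the gap between the peak level and nominal line level."""
    return peak_dbu - nominal_dbu

nominal_dbu = volts_rms_to_dbu(1.228)   # professional line level, about +4 dBu
hypothetical_peak_dbu = 24.0            # assumed clipping point, for illustration
print(round(nominal_dbu, 2))
print(round(headroom_db(hypothetical_peak_dbu, nominal_dbu), 2))
```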
Now, armed with an understanding of dynamic range, we can start to discuss how this is useful in practice. Without going into a long derivation, let's simply state what is known as the "6 dB rule". This rule is key to the relationship between dynamic range and computational word width. The complete formulation is described in the equation below, but in shorthand the 6 dB rule means that the addition of one bit of precision will lead to a dynamic range increase of 6 dB. Note that the 6 dB rule does not take into account the analog subsystem of an audio design, so the imperfections of the transducers on both the input and the output must be considered separately.
Dynamic Range (dB) = 6.02n + 1.76 ≈ 6n dB
n = the number of precision bits
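The formula is easy to evaluate for the common word widths. The Python sketch below does so, and also inverts the formula to estimate how many bits a target dynamic range would need. The function names are illustrative, and the results describe only the quantization limit, not the analog subsystem.

```python
import math

def dynamic_range_db(bits):
    """Full form of the 6 dB rule: 6.02n + 1.76 dB."""
    return 6.02 * bits + 1.76

def bits_for_dynamic_range(target_db):
    """Smallest whole number of bits whose dynamic range meets target_db."""
    return math.ceil((target_db - 1.76) / 6.02)

for bits in (16, 24, 32):
    print(bits, "bits:", round(dynamic_range_db(bits), 2), "dB")

# Bits needed to cover the roughly 120 dB range of the human ear.
print(bits_for_dynamic_range(120.0))
```

The shorthand 6n figure for each width is slightly lower than the full formula, since it drops the 1.76 dB term and rounds 6.02 down to 6.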
The "6 dB rule" dictates that the more bits we use, the higher the audio quality we can attain. In practice, however, there are only a few realistic choices of word width. Most devices suitable for embedded media processing come in three word width flavors: 16-bit, 24-bit, and 32-bit. Table 2 summarizes the dynamic ranges for these three types of processors.
Table 2: Dynamic range of various fixed-point architectures.
Since we're talking about the 6 dB rule, it is worth mentioning something about the nonlinear quantization methods that are typically used for speech signals. A telephone-quality linear PCM encoding requires 12 bits of precision. However, our ears are more sensitive to audio changes at small amplitudes than at high amplitudes, so the uniform step size of linear PCM is overkill for telephone communications. The logarithmic quantization used by the A-law and μ-law companding standards spends its resolution where the ear needs it, using fine steps for quiet signals and coarse steps for loud ones, and thereby achieves a 12-bit PCM level of quality using only 8 bits of precision. To make our lives easier, some processor vendors have implemented A-law and μ-law companding in the serial ports of their devices. This relieves the processor core from doing logarithmic calculations.
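To make the idea of logarithmic quantization concrete, here is a minimal Python sketch of μ-law companding. It assumes the continuous μ-law curve with μ = 255 and a simple uniform 8-bit quantizer. This is not the segmented, bit-exact encoding that telephony codecs and serial-port hardware use, and all function names are illustrative.

```python
import math

MU = 255.0  # assumed mu value for the continuous mu-law curve

def mu_law_compress(x):
    """Map a sample in [-1, 1] through the logarithmic mu-law curve."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mu_law_expand(y):
    """Inverse of mu_law_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

def quantize(x, bits):
    """Uniform quantizer: round x in [-1, 1] to the nearest signed code."""
    scale = 2 ** (bits - 1) - 1
    return round(x * scale) / scale

def mu_law_roundtrip(x, bits=8):
    """Compress, quantize to the given width, then expand."""
    return mu_law_expand(quantize(mu_law_compress(x), bits))

quiet_sample = 0.001  # a small-amplitude input, where the ear is most sensitive
linear_error = abs(quantize(quiet_sample, 8) - quiet_sample)
companded_error = abs(mu_law_roundtrip(quiet_sample) - quiet_sample)
print(linear_error, companded_error)
```

For a quiet sample like this one, the 8-bit linear quantizer rounds the value to zero, while the companded path preserves it with a much smaller error. The price is coarser steps at large amplitudes, where the error is less audible.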
After reviewing Table 2, recall once again that the dynamic range of the human ear is around 120 dB. By the 6 dB rule, a 16-bit data representation provides only about 96 dB, so it doesn't quite cut it for high-quality audio. This is why vendors introduced 24-bit processors. However, these 24-bit systems are a bit non-standard from a C compiler standpoint, so many audio designs these days use 32-bit processing.