Some thoughts about clock recovery
The conversion clock of a data converter is critical to its linearity: any irregularity in the clock results in sampling jitter, which phase-modulates the converted signal. This is most easily assessed by passing a high-amplitude, high-frequency signal through the converter and looking for phase modulation sidebands or skirts in the spectrum of the output.
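As a rough illustration of why a high-frequency test signal is the revealing one, the sketch below estimates the sideband level produced by purely sinusoidal sampling jitter. It uses the standard small-angle phase-modulation approximation (modulation index beta = 2*pi*f*J, each first-order sideband about beta/2 relative to the carrier). The function name and the 20 kHz / 1 ns test condition are assumptions chosen for illustration, not figures from this article.

```python
import math

def jitter_sideband_dbc(signal_hz: float, jitter_peak_s: float) -> float:
    """Approximate level of each first-order sideband relative to the carrier.

    Assumes sinusoidal jitter of peak amplitude jitter_peak_s (seconds) and a
    small modulation index, so that the sideband amplitude is about beta / 2.
    """
    beta = 2.0 * math.pi * signal_hz * jitter_peak_s  # phase deviation, radians
    return 20.0 * math.log10(beta / 2.0)

# Hypothetical test condition: 20 kHz tone, 1 ns peak sinusoidal jitter.
level = jitter_sideband_dbc(20e3, 1e-9)
print(f"sidebands at about {level:.1f} dBc")

# The same jitter on a 1 kHz tone gives sidebands 26 dB lower, so they are
# much harder to see.
print(f"at 1 kHz: {jitter_sideband_dbc(1e3, 1e-9):.1f} dBc")
```

Because the sideband level rises 6 dB for every doubling of signal frequency, a tone near the top of the audio band exposes jitter that a low-frequency tone would hide.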
Converter clocks are usually derived from a phase-locked loop (PLL) rather than a free-running local oscillator, since a device can rarely be system clock master all the time: it must be able to lock to an external reference or to its digital audio or computer interface. To lock to all references, the lock range may need to be wide (perhaps ±1000 ppm), and usually many different sample rates must be accommodated. See  for a chilling glimpse of the enormity of the problem.
The need to lock our converter clock to various wide-ranging references while maintaining low jitter has tended to be one of the most difficult challenges in converter design, since it embodies some tough trade-offs. Although other applications (e.g., telecoms) had pretty much solved this problem before digital audio was thought of, we audio people just had to start from scratch and it's taken a couple of decades to reinvent a good solution.
The situation has become more challenging now that we need to lock to software-generated syncs and timestamps arriving over computer interfaces, since they can embody large amounts of jitter with uncontrolled spectrum, and may not come around as often as we'd like.
Figure 3: Basic analogue PLL.
Figure 3 shows a conventional analogue PLL, such as might be used to lock a converter to a reference clock. A phase comparator continually compares the external reference with the regenerated version, and decides whether we should speed up or slow down our VCO to make them match in frequency and phase. But let's not be hasty: we need to respond in a leisurely fashion or else we will track any incoming jitter. So we smooth out the up/down requests with a low-pass loop filter before passing them to the control input of the VCO. So far, so good.
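The loop just described can be sketched as a phase-domain simulation. This is a minimal model under stated assumptions, not the circuit of Figure 3: everything is linearised and discrete-time, phase is measured in cycles, and the loop filter is a textbook proportional-plus-integral type whose gains (`kp`, `ki`) are hypothetical values chosen to give a slow, well-damped loop.

```python
import math

def run_pll(ref_phase, kp=0.02, ki=0.0001):
    """Phase-domain PLL model; returns the regenerated clock phase per step."""
    vco_phase = 0.0   # phase of the regenerated clock, in cycles
    integ = 0.0       # integrator state of the loop filter
    out = []
    for ref in ref_phase:
        out.append(vco_phase)
        err = ref - vco_phase          # phase comparator
        integ += ki * err              # loop filter: integral path
        control = kp * err + integ     # loop filter: proportional + integral
        vco_phase += control           # VCO: control sets the phase step
    return out

def rms(xs):
    """RMS about the mean."""
    mean = sum(xs) / len(xs)
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / len(xs))

# Reference: a constant frequency offset (phase ramp) plus sinusoidal jitter.
N = 4000
OFFSET = 0.001        # cycles per step; hypothetical frequency error
JITTER_AMP = 0.01     # cycles peak; hypothetical incoming jitter
JITTER_FREQ = 0.1     # cycles per step, well above the loop bandwidth

ideal = [OFFSET * n for n in range(N)]
jitter = [JITTER_AMP * math.sin(2 * math.pi * JITTER_FREQ * n) for n in range(N)]
ref = [i + j for i, j in zip(ideal, jitter)]
out = run_pll(ref)

# Compare jitter on the reference and on the regenerated clock after lock.
settled = slice(3000, N)
jitter_in = rms(jitter[settled])
jitter_out = rms([o - i for o, i in zip(out[settled], ideal[settled])])
print(f"jitter in: {jitter_in:.5f} cycles rms, out: {jitter_out:.5f} cycles rms")
```

With these gains the loop follows the frequency offset but passes only a small fraction of the fast jitter. Raising `kp` makes it lock sooner and track more of the jitter, which is the trade-off at issue here.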
But the tough trade-offs are mostly about choosing the right loop filter characteristics and the right sort of VCO. We need to reject incoming jitter ('jitter rejection') down to low frequencies in order not to be prey to audible sampling jitter. That will also potentially allow us to accommodate a low comparison frequency (such as a reference comprising infrequent software time stamps, or a video sync which only lines up with an audio sample every few seconds).
Unfortunately, a low loop-filter corner frequency will make our PLL slow to lock up, but we can't have everything. Worse, though, it will prevent the loop from suppressing the phase noise of the VCO: the loop can only correct VCO noise at frequencies below its corner. So if we are to avoid unacceptable 'intrinsic jitter' we need to keep the loop-filter corner frequency high, or else choose a type of VCO which has very low phase noise in the first place.
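The conflict can be put in numbers with a deliberately simplified model. Assuming a first-order loop (real loops are usually at least second-order), reference jitter reaches the output through a low-pass response and VCO phase noise through the complementary high-pass, both with the same corner. The function names and the corner frequencies below are hypothetical.

```python
import math

def ref_jitter_gain_db(f_hz: float, corner_hz: float) -> float:
    """Gain from reference jitter to output: first-order low-pass."""
    return -10.0 * math.log10(1.0 + (f_hz / corner_hz) ** 2)

def vco_noise_gain_db(f_hz: float, corner_hz: float) -> float:
    """Gain from VCO phase noise to output: the complementary high-pass."""
    ratio_sq = (f_hz / corner_hz) ** 2
    return 10.0 * math.log10(ratio_sq / (1.0 + ratio_sq))

JITTER_HZ = 100.0  # hypothetical jitter frequency of interest

for corner in (10.0, 1000.0):  # hypothetical low and high loop corners, Hz
    print(
        f"corner {corner:6.0f} Hz: "
        f"reference jitter {ref_jitter_gain_db(JITTER_HZ, corner):6.1f} dB, "
        f"VCO noise {vco_noise_gain_db(JITTER_HZ, corner):6.1f} dB"
    )
```

In this model a 10 Hz corner attenuates 100 Hz reference jitter by about 20 dB but leaves VCO noise at 100 Hz essentially untouched, and a 1 kHz corner does exactly the opposite. No single corner frequency gives both.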
A good low-noise oscillator is a quartz VCXO. But because of its high Q, a quartz crystal doesn't like to be pulled very far from its natural frequency, and so may not have enough pull range for our requirements (perhaps only ±100 ppm). On the other hand, a humble RC multivibrator VCO can have all the pull range we need, but is essentially untuned and so has large phase noise and is prey to all manner of interference. A few other VCO options exist in between these extremes.
To solve the pull-range problem with quartz, we could commission some special VCXOs made of a material with a lower Q, e.g. langasite (LGS) or lithium tantalate. These are quite expensive.
We could use a tuned-circuit (LC) VCO, which has a much lower Q than a crystal, but at least it has some Q, so it can be designed with much lower phase noise than a multivibrator. Its wide range can cover both n*44k1 and n*48k rates, and easily accommodate ±1000 ppm reference inaccuracy. Overall, not a bad choice; but if we're being picky, the intrinsic jitter isn't going to be top-class if we drop the corner frequency of the loop to where we'd like it.
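To get a feel for how much range that is, the short calculation below works out the symmetric pull range a single VCO would need in order to span both rate families with the ±1000 ppm margin mentioned above. The function name and the 512x master-clock multiple are assumptions for illustration; the multiple cancels out of the result.

```python
def pull_range_ppm(f_low_hz: float, f_high_hz: float, margin_ppm: float = 0.0) -> float:
    """Symmetric pull range, in ppm about the midpoint, needed to span
    f_low..f_high with each end widened by margin_ppm."""
    low = f_low_hz * (1.0 - margin_ppm * 1e-6)
    high = f_high_hz * (1.0 + margin_ppm * 1e-6)
    centre = 0.5 * (low + high)
    return 0.5 * (high - low) / centre * 1e6

MULT = 512  # hypothetical master-clock multiple of the sample rate
need = pull_range_ppm(MULT * 44100, MULT * 48000, margin_ppm=1000.0)
print(f"required pull range: about +/-{need:.0f} ppm ({need / 1e4:.1f} %)")
```

The answer is a little over ±4 %, several hundred times the ±100 ppm or so that a quartz VCXO might offer, which is why one VCXO per rate family is the usual alternative.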
In considering these trade-offs, it is helpful to look at some real-world examples of applying analogue PLL technology to converter subsystems (Figure 4).
Figure 4: Some analogue PLL schemes.
Top: using a basic PLL chip, or, for an AES3 or SPDIF DAC, using the PLL in the digital interface receiver (DIR). Problem: the VCO is low-Q with large phase noise (usually an RC multivibrator, which is vulnerable to all sorts of interference, especially power rail and ground noise), and the corner frequency of the loop filter is high. Result: intrinsic jitter is high, jitter rejection at audible frequencies is poor.
Middle: using a purpose-designed PLL for converter clock recovery, with higher-Q VCO and a low loop-filter corner frequency. Result: intrinsic jitter and jitter rejection are good, but we have to carry the cost of two (or more) VCXOs and the pull range may be insufficient.
Bottom: using two cascaded PLLs; the first one with an LGS VCXO (for wide lock range) and a low corner-frequency filter, which does our jitter rejecting; the second one perhaps with a tuned-circuit VCO and a high loop-filter corner frequency (we don't need reference jitter rejection now, it's already gone) so as to cover all sample rate multiples. If we make the LGS VCXO frequency a common multiple of both base rates (n*48k = m*44k1), we can provide a sample-rate reference for the second PLL with a simple programmable divider. Result: a pretty good solution; it has good intrinsic jitter and jitter rejection, it works for all sample rate multiples of 44k1 and 48k, and it has a wide lock range. But even with one VCXO it's still quite expensive, and it still may have questionable performance at low comparison rates.
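The arithmetic behind the single-VCXO trick is easy to check. The sketch below finds the lowest frequency that is an integer multiple of both base rates and the divider setting needed for each. The helper name is hypothetical, and a practical VCXO might well run at some multiple of this frequency.

```python
import math

BASE_RATES_HZ = (44100, 48000)

# Lowest frequency that is an integer multiple of both base rates.
common_hz = BASE_RATES_HZ[0] * BASE_RATES_HZ[1] // math.gcd(*BASE_RATES_HZ)

def divider_for(vcxo_hz: int, sample_rate_hz: int) -> int:
    """Programmable-divider setting that yields sample_rate_hz from vcxo_hz."""
    if vcxo_hz % sample_rate_hz != 0:
        raise ValueError("VCXO frequency is not a multiple of the sample rate")
    return vcxo_hz // sample_rate_hz

print(f"common frequency: {common_hz / 1e6:.3f} MHz")
for rate in BASE_RATES_HZ:
    print(f"  divide by {divider_for(common_hz, rate)} for {rate} Hz")
```

Only the two base rates need come from the divider, since the second PLL multiplies the chosen reference up to whatever rate multiple is required.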
An even better solution is to adapt the dual-loop architecture as a 'hybrid PLL', by implementing the first PLL in the digital domain, so that the trade-off of phase-noise vs. low corner frequency is broken: the loop filter and the 'VCO' are both entirely digital. The VCO (now an 'NCO') is very jittery, since it is a varying integer division of a fixed master clock, but inclusion of a sigma-delta modulator in the loop means that its jitter can be confined to very high frequencies.
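The idea of an NCO built from a varying integer division can be sketched with the simplest possible modulator: a first-order sigma-delta (an accumulator with carry) that switches the divider between two adjacent integers so that the average ratio is exactly the fractional one required. The 100 MHz master clock and the 12.288 MHz target are hypothetical, and a real design would probably use a higher-order modulator to push the jitter further up in frequency.

```python
def sigma_delta_divider(master_hz: int, target_hz: int, n_periods: int) -> list:
    """Integer division ratio to use for each successive output period."""
    base, rem = divmod(master_hz, target_hz)  # master/target = base + rem/target
    acc = 0                                   # accumulated fractional error
    ratios = []
    for _ in range(n_periods):
        acc += rem
        if acc >= target_hz:                  # carry: this period is one cycle longer
            acc -= target_hz
            ratios.append(base + 1)
        else:
            ratios.append(base)
    return ratios

MASTER_HZ = 100_000_000   # hypothetical fixed master clock
TARGET_HZ = 12_288_000    # hypothetical NCO output, 256 x 48 kHz

ratios = sigma_delta_divider(MASTER_HZ, TARGET_HZ, 384)
average = sum(ratios) / len(ratios)
print(f"ratios used: {sorted(set(ratios))}, average {average:.6f}")
print(f"ideal ratio: {MASTER_HZ / TARGET_HZ:.6f}")
```

Each output edge is wrong by up to one master-clock period, so the NCO is indeed very jittery, but the error never accumulates, and the pattern of long and short periods changes quickly enough that the jitter sits at high frequencies where a following analogue PLL can remove it.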
It is therefore straightforward to cascade the NCO output into an analogue PLL with a very high corner-frequency, which can therefore use a cheap VCO without intrinsic jitter being a problem. Furthermore we can change the corner-frequency in software so as to achieve fast lock and then extreme LF jitter rejection. The hybrid PLL is very cheap, since it requires no resonator-based VCO.
Similar solutions are now available for audio use from a number of vendors (for example , whose topology is shown in Figure 5) – some of them are even built into data converters.
Finally, I should mention that with a bit of thinking outside the box there is another way to skin this cat. Modern low-cost sample-rate-converter chips (SRCs) are achieving performance which can arguably exceed that of the data converter itself. So you might elect to operate the conversion element at a fixed rate provided by a local crystal (thus eliminating sampling jitter) and to rate convert the converter input or output data. This approach can lead to other issues, and places the responsibility on the SRC to be able to achieve jitter rejection to the same standard as in the PLL model whilst also protecting the quality of your audio crown jewels.
Figure 5: Hybrid PLL, from .