Ever-higher bandwidth requirements continue to drive the development of, and demand for, 40G and 100G systems. Example consumer applications include YouTube, Facebook, smart phones, and IP-TV. Governmental and business demands compound the urgency with a variety of complex, data-intensive applications including weather prediction, financial analysis, genomics research, and design simulation. Further, the rapid emergence of cloud computing for both personal and business use adds further challenges for high-volume, high-complexity data transmission.
To implement these link speeds, SerDes devices must meet tighter performance specifications, running at extremely high speeds with extremely low bit error rates (BER). As BER targets fall, the quality of the clock source becomes critical, because random phase jitter is multiplied by a scalar factor (which can exceed 16) when closing link timing. Thus, to a large degree, the quality and performance of the link depend on the phase-locked loop (PLL) circuit in the SerDes. To meet BER specifications of 10^-12 and 10^-15, the PLL must exhibit ultra-low jitter, on the order of 600 fs or less.
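The scalar factor described above can be illustrated with the widely used dual-Dirac jitter model, in which random jitter is treated as Gaussian and total jitter is estimated as TJ = DJ + 2 x Q(BER) x RJrms. The sketch below assumes that model (the article does not say which model it uses) and a simplified BER-to-Q mapping that ignores transition density. The function names are illustrative.

```python
from statistics import NormalDist


def q_factor(ber: float) -> float:
    """Distance Q (in sigmas) at which a one-sided Gaussian tail equals ber."""
    return -NormalDist().inv_cdf(ber)


def rj_multiplier(ber: float) -> float:
    """Peak-to-peak multiplier applied to RMS random jitter (both tails)."""
    return 2.0 * q_factor(ber)


def total_jitter(dj_pp: float, rj_rms: float, ber: float) -> float:
    """Dual-Dirac estimate: deterministic jitter plus scaled random jitter."""
    return dj_pp + rj_multiplier(ber) * rj_rms


if __name__ == "__main__":
    for ber in (1e-12, 1e-15):
        print(f"BER {ber:.0e}: RJ multiplier {rj_multiplier(ber):.2f}")
```

Under these assumptions the multiplier is roughly 14 at a BER of 10^-12 and close to 16 at 10^-15, which is why every femtosecond of random jitter in the clock source matters at these targets.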
System requirements for SerDes have become increasingly demanding due to link speeds and new standards, yet 40G/100G transmissions must retain backwards compatibility. High-speed Internet builds on the ubiquity of the 10 Gbps serial Ethernet standard (10GBASE-KR, XAUI, RXAUI), the dominant standard for wireline communications. The new 40G/100G Ethernet standard leverages the existing 10 Gbps standard to provide four to ten times the bandwidth. By using a multi-lane approach, system designers achieve much higher aggregate data rates than the fundamental electrical line rate. The XLAUI standard (40G Ethernet) achieves 40 Gbps with four 10.3125 Gbps lanes, while CAUI (100G Ethernet) reaches 100 Gbps with ten lanes of 10.3125 Gbps.
Both the 40G XLAUI and 100G CAUI standards retain the Media Access Control (MAC) frame architecture and link encoding (64b/66b) used in 10 Gbps systems. This minimizes the impact of upgrading systems from 10 Gbps to 40 or 100 Gbps. The explosive growth in bandwidth demand has stressed enterprise, data center, and core networking equipment the most, and these network elements are leading the early adoption of 40G/100G technology.
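The relationship between the 10.3125 Gbps lane rate and the nominal 40G/100G figures follows from the 64b/66b encoding overhead. The short sketch below works through that arithmetic with exact fractions. The constant and function names are illustrative, not taken from any standard.

```python
from fractions import Fraction

LANE_RATE_GBPS = Fraction(103125, 10000)  # 10.3125 Gbps electrical line rate
CODE_EFFICIENCY = Fraction(64, 66)        # 64 payload bits per 66 line bits


def payload_rate_gbps(lanes: int) -> Fraction:
    """Aggregate payload rate after removing the 64b/66b overhead."""
    return lanes * LANE_RATE_GBPS * CODE_EFFICIENCY


if __name__ == "__main__":
    for name, lanes in (("XLAUI", 4), ("CAUI", 10)):
        print(f"{name}: {lanes} lanes -> {float(payload_rate_gbps(lanes))} Gbps")
```

Each lane carries exactly 10 Gbps of payload, so four lanes give 40 Gbps and ten lanes give 100 Gbps.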
The next generation of high-speed I/O standards for networks, computer I/O buses, and storage area networks supports these projected bandwidth requirements. Today, 5 to 6 Gbps typifies high-speed transceiver data rates and supports the following standards: PCIe Gen1 and Gen2; SATA-3; Optical Internetworking Forum (OIF) CEI-6G and CEI-11G electrical standards; and Interlaken.
By contrast, the new 40G/100G Ethernet protocols rely on 8 to 11 Gbps transceiver rates and the aforementioned multi-lane architecture. The SerDes Framer Interface (SFI) standard SFI-S, based on the same electrical standards as Interlaken, also reaches 100 Gbps to support 100G optical networks. The PCIe Gen3 computer I/O bus specification uses the slower rate of 8 Gbps, but defines x1, x2, x4, x8, x12 and x16 configurations.
Designers can choose among these PCIe lane configurations to match the bandwidth requirements of an individual design. Even as the industry begins to migrate to 40G/100G, system architects are already anticipating 25 Gbps transceiver rates and beyond. Figure 1 charts bit rates since 1995 and projects growth along the same line to 2015.
Figure 1. Map of I/O bit rates 1995 to 2015.
PLL, transmitter, and receiver design
A variety of techniques can be used to design the critical SerDes circuits. A good transmitter complies with the standards: it offers a wide swing range, robust ESD protection, and accurate termination and equalization, and it exceeds the return loss requirements. The transmitter is typically a source of deterministic jitter (DJ) because of its limited power supply noise rejection, so good local decoupling is critical.
In a typical example, the transmitter (TX) serializes ten differential data lines into a single high-speed differential output using a 20-to-1 multiplexer. The TX feeds the output to a three-tap finite impulse response (FIR) filter-based equalizer. The filter pre-distorts the transmitted pulse to compensate for frequency-dependent channel loss characteristics. In the final two stages, a pre-driver and a driver boost the differential signal onto the channel and provide a nominal 100-ohm termination.
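A minimal sketch of how a three-tap FIR equalizer pre-distorts an NRZ symbol stream is shown below. The tap weights (pre-cursor, main, post-cursor) are hypothetical values chosen only so that their magnitudes sum to one; the article does not give the actual coefficients, and the edge handling is an assumption made for illustration.

```python
def tx_ffe(symbols, c_pre=-0.1, c_main=0.7, c_post=-0.2):
    """Apply a 3-tap transmit FIR to a sequence of +1/-1 NRZ symbols.

    Tap weights are hypothetical. At the ends of the sequence the missing
    neighbour is assumed to equal the current symbol.
    """
    n = len(symbols)
    out = []
    for k in range(n):
        nxt = symbols[k + 1] if k + 1 < n else symbols[k]  # pre-cursor input
        prv = symbols[k - 1] if k > 0 else symbols[k]      # post-cursor input
        out.append(c_pre * nxt + c_main * symbols[k] + c_post * prv)
    return out


if __name__ == "__main__":
    pattern = [-1, -1, -1, +1, +1, +1, -1]
    print([round(v, 2) for v in tx_ffe(pattern)])
```

With these weights, the first bit after a transition is driven at a larger amplitude than the repeated bits that follow it. That boosts the high-frequency content the channel attenuates most, which is the pre-distortion the paragraph above describes.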
The receiver should comply with the standard in the same way the transmitter does, but in addition we consider the jitter rejection properties of the clock/data recovery (CDR) module. The CDR is a 2nd-order system that tracks the phase and frequency of the incoming data stream and recovers a clock centered at the ideal position relative to the data eye. In older technologies it was common to use analog implementations for the CDR function, a PLL for example. Digital circuits are an increasingly common design choice because of advances in process scaling and a desire to minimize power dissipation (analog CDRs rely on power-hungry supply regulators). A digital CDR suffers from dithering, but this is negligible in a well-designed implementation. The CDR tracks and rejects jitter in the incoming data stream up to its loop bandwidth.
In a typical implementation, the receiver presents a nominal 100-ohm termination to the incoming differential channel. The data stream passes through a variable gain amplifier (VGA) and a continuous-time linear equalizer (CTLE) with peaking control to compensate for transmission channel losses. A non-linear adaptive decision feedback equalizer (DFE) then addresses intersymbol interference (ISI). Following sampling, the serial data arrives at the deserializer logic for conversion back to ten differential signals. A phase interpolator takes inputs from the DLL, the CDR logic, and a bang-bang phase detector (selected for its relative insensitivity to data patterns) and sets the timing for the samplers.
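The behaviour of a second-order digital CDR driven by a bang-bang phase detector can be sketched with a simple behavioural model: the detector reports only early or late, a proportional path corrects phase, and an integral path accumulates a frequency estimate. The gains, the frequency offset, and the function names below are hypothetical and are not taken from the MoSys design.

```python
def bang_bang_cdr(input_phase, kp=0.01, ki=1e-5):
    """Behavioural 2nd-order CDR loop; phases are in unit intervals (UI).

    kp (proportional gain) and ki (integral gain) are hypothetical values.
    Returns the per-symbol phase errors and the final frequency estimate.
    """
    phase = 0.0   # recovered clock phase
    freq = 0.0    # frequency estimate, in UI per symbol
    errors = []
    for phi_in in input_phase:
        err = phi_in - phase
        errors.append(err)
        vote = 1 if err > 0 else -1   # bang-bang detector: sign only
        freq += ki * vote             # integral path tracks frequency
        phase += kp * vote + freq     # proportional path tracks phase
    return errors, freq


def demo(offset=0.001, n=10000):
    """Track data whose phase ramps by `offset` UI per symbol (1000 ppm)."""
    phase_in = [offset * k for k in range(n)]
    errors, freq = bang_bang_cdr(phase_in)
    return max(abs(e) for e in errors[-1000:]), freq


if __name__ == "__main__":
    peak_err, freq = demo()
    print(f"steady-state peak error {peak_err:.4f} UI, freq estimate {freq:.6f}")
```

In this model the residual phase error never settles to zero but toggles within roughly one proportional step. That toggling is the dithering mentioned above, and it stays small when the proportional gain is much larger than the integral gain.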
Finally, since the PLL characteristics determine high-speed link performance, the PLL must be designed to minimize jitter across the entire operating range. Simply stated, the PLL is the critical piece of the puzzle that is SerDes design.
There are several ways to implement PLLs: with classical ring oscillators, with digital circuits, or with LC oscillators. Ring oscillators are quite common, but their noise performance, on the order of 1 ps RMS, tends to be unacceptable for high-data-rate designs (recall that random jitter sources are multiplied by a factor of 15 to 20 in timing calculations, depending on the BER desired). LC PLLs offer superior noise performance, but additional engineering is required to ensure they are tunable and cover the frequency range required by the design.
A design example follows that describes how the challenges of an LC design are overcome and its strengths leveraged. The design was implemented in a MoSys SerDes fabricated in UMC 40-LL technology. MoSys has implemented PLLs with equivalent functionality and jitter performance in its high-speed SerDes designs at the 28, 40, and 65 nm nodes across three foundries (TSMC, UMC, Fujitsu).
The first challenge is programmability. LC oscillators typically don't have the inherent range offered by ring-oscillator designs. This design integrates programmable capacitances into the voltage-controlled oscillator (VCO), which lets the PLL achieve the broad frequency range more typical of ring-oscillator-based PLLs. With a precision bandgap reference, the ultra-low-jitter wideband LC PLL and associated DLL deliver a stable bias across process, voltage, and temperature. The programmable charge pump current, loop filter resistance, and ripple capacitance provide control over loop bandwidth and peaking. As a result, system designers can fine-tune the SerDes to meet individual application design requirements. As an added benefit, the optimized PLL design also minimizes power consumption.
Performance
The PLL's performance was characterized in silicon with the PLL embedded in an 8-lane (octal) SerDes. All measurements were taken with a 156.25 MHz reference clock at the output of the farthest transmitter using a clock pattern. Measurements include the effects of about one millimeter of clock distribution. In part due to the wide PLL lock frequency range and excellent stability, the MoSys SerDes also delivers low TX jitter and high RX jitter tolerance. Combined, these performance characteristics produce a SerDes with a BER of 10^-15.
The open-loop VCO output frequency varies linearly with the calibration code (see Figure 2), from 4.5 GHz to 6 GHz. The measurements, made with a 173.43 MHz reference clock, demonstrate that the frequency is essentially independent of the four possible bandgap reference (BGR) voltage settings (identified in the key on the right of the graph). The measured VCO frequency range of 4.5 GHz to 6 GHz matches the design target. It also meets the requirements for both Ethernet 10GBASE-KR and CEI-11G with half-rate clocking.
Figure 2. PLL VCO open loop output frequency.
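Two quick checks on the 4.5 GHz to 6 GHz range can be sketched as follows. The first confirms that half-rate clocking (VCO frequency equal to half the line rate) places the 10.3125 Gbps and 11.2 Gbps rates discussed in this article inside the range. The second uses the ideal LC tank relation f = 1 / (2 x pi x sqrt(L x C)) to estimate how much the programmable capacitance must vary to span it. The inductance value is hypothetical, and the model ignores parasitics and the varactor.

```python
import math

VCO_MIN_GHZ, VCO_MAX_GHZ = 4.5, 6.0  # measured range reported in the article


def half_rate_vco_ghz(line_rate_gbps: float) -> float:
    """With half-rate clocking the VCO runs at half the line rate."""
    return line_rate_gbps / 2.0


def in_vco_range(line_rate_gbps: float) -> bool:
    return VCO_MIN_GHZ <= half_rate_vco_ghz(line_rate_gbps) <= VCO_MAX_GHZ


def tank_frequency_hz(l_henry: float, c_farad: float) -> float:
    """Resonant frequency of an ideal LC tank."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_henry * c_farad))


def tank_capacitance_f(l_henry: float, f_hz: float) -> float:
    """Capacitance needed to resonate an ideal tank at f_hz."""
    return 1.0 / (l_henry * (2.0 * math.pi * f_hz) ** 2)


L_TANK = 0.5e-9  # hypothetical 0.5 nH inductor, for illustration only

if __name__ == "__main__":
    for rate in (10.3125, 11.2):
        print(f"{rate} Gbps -> VCO {half_rate_vco_ghz(rate)} GHz, "
              f"in range: {in_vco_range(rate)}")
    c_low = tank_capacitance_f(L_TANK, VCO_MAX_GHZ * 1e9)
    c_high = tank_capacitance_f(L_TANK, VCO_MIN_GHZ * 1e9)
    print(f"capacitance ratio needed: {c_high / c_low:.3f}")
```

Because frequency scales with the inverse square root of capacitance, covering 4.5 GHz to 6 GHz requires a total tank capacitance ratio of (6 / 4.5)^2 = 16/9, or about 1.78, whatever inductance is chosen.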
Transmitter (output) jitter was characterized at 11.2 Gbps with a PRBS-11 pattern. Jitter performance compares favorably to the best results published to date: measured random jitter (RJrms) is 524 fs and total jitter (TJ) is 15.8 ps.
Figure 3 presents the transmitter eye diagram, showing a wide-open eye with a width of 79.1 ps and a height of 349 mV. TJ is 0.17 UI, a margin of 25% relative to the CEI-11G specification. Receiver jitter tolerance is equally impressive, as shown in Figure 4 for 10.3 Gbps and in Figure 5 for 11.2 Gbps.
Figure 3. Transmitter eye diagram.
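The conversion between the picosecond figures and unit intervals (UI) quoted for the transmitter can be reproduced as follows. This is a small sketch that assumes the eye in Figure 3 was captured at the same 11.2 Gbps rate as the jitter measurement. The function names are illustrative.

```python
def unit_interval_ps(line_rate_gbps: float) -> float:
    """Duration of one bit (one UI) in picoseconds."""
    return 1000.0 / line_rate_gbps


def to_ui(time_ps: float, line_rate_gbps: float) -> float:
    """Express a time interval as a fraction of the unit interval."""
    return time_ps / unit_interval_ps(line_rate_gbps)


if __name__ == "__main__":
    rate = 11.2  # Gbps
    print(f"UI        = {unit_interval_ps(rate):.2f} ps")
    print(f"TJ        = {to_ui(15.8, rate):.3f} UI")
    print(f"eye width = {to_ui(79.1, rate):.3f} UI")
```

At 11.2 Gbps one UI is about 89.3 ps, so a TJ of 15.8 ps corresponds to roughly 0.177 UI, consistent with the 0.17 UI figure quoted above.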
Figure 4. Receiver jitter tolerance at 10.3 Gbps with a PRBS-31 pattern.
Figure 5. Receiver jitter tolerance at 11.2 Gbps with a PRBS-31 pattern.
Other standards
The explosive growth of 40G/100G Ethernet and the broad adoption of the underlying 10GBASE-KR standard inspired the development of this SerDes. This versatile design also more than meets the OIF requirements for the CEI-11G SR, SFP+, XFP, and XFP+ modules.
Conclusion
The most critical aspect of designing a high-speed, low BER SerDes is the quality of the PLL. In this paper, in addition to addressing other aspects of the design, such as the transmit/receive portions, we have provided a specific example of an implemented PLL.
The silicon results presented above document the PLL and wideband VCO performance of the MoSys SerDes. The PLL architecture embodied in the SerDes IP competes well with other PLL architectures for multi-gigabit applications.
The 10 Gbps solution exhibits the excellent performance characteristics needed for both 40G/100G Ethernet and CEI-11G solutions. The multi-protocol SerDes includes built-in programmability to address a wide variety of application and transmission environments. This enables full backwards compatibility with a wide variety of current and emerging standards, including 1G Ethernet (SGMII), 10G Ethernet (XAUI, RXAUI, 10GBASE-R), 40G Ethernet (XLAUI), and 100G Ethernet (CAUI).
About the author
Dr. Claude Gauthier joined MoSys in 2007 and is responsible for the Advanced SerDes IP Development group, which designs multi-protocol IP.
Prior to joining MoSys, Dr. Gauthier worked as a designer at ATI and Sun Microsystems. He holds a Ph.D. from the University of Michigan, Ann Arbor.