Current and future generations of wireless cellular systems feature heavy use of Remote Radio Heads (RRHs) in the base stations. Instead of hosting a bulky base station controller close to the top of antenna towers, new wireless networks connect the base station controller and remote radio heads through low-loss optical fibers. The interface protocol that enables such a distributed architecture is called the Common Public Radio Interface (CPRI). With this new architecture, RRHs offload intermediate frequency (IF) and radio frequency (RF) processing from the base station. Furthermore, the base station and RF antennas can be physically separated by a considerable distance, providing much-needed system deployment flexibility.
Typical advanced processing algorithms on RRHs include digital up-conversion and digital down-conversion (DUC and DDC), crest factor reduction (CFR), and digital pre-distortion (DPD). DUC interpolates base band data to a much higher sample rate via a cascade of interpolation filters. It further mixes the complex data channels with IF carrier signals so that RF modulation can be simplified.
CFR reduces the peak-to-average power ratio of the data so it does not enter the non-linear region of the RF power amplifier. DPD estimates the distortion caused by the non-linear effect of the power amplifier and pre-compensates the data. Together, CFR and DPD protect the data, mitigate the effect of power amplifier non-linear distortions, and widen the operating range. However, CFR and DPD are computationally intensive and need to support very high throughput streaming data. Field programmable gate arrays (FPGAs) are well suited to computationally intensive RRH designs. The abundant hardened multipliers on an FPGA improve speed and reduce area and power in arithmetic-heavy RRH implementations.
More importantly, many wireless standards demand reconfigurability in both the base station and the RRH. For example, the 3GPP Long Term Evolution (LTE) and WiMAX systems both feature scalable bandwidth. The RRH should be able to adjust, at run time, the bandwidth selection, the number of channels, and the incoming data rate, among other parameters. At the same time, as FPGAs evolve with higher density, larger numbers of hardened multipliers, and more complex embedded processors, it has become possible to support multiple wireless standards in a single device. For instance, a US wireless vendor may support both WCDMA (UMTS) systems and 3GPP LTE systems from a single RRH card. A wireless operator in China may service the same location with both LTE and TD-SCDMA networks. With a multi-mode, single-device RRH solution, network providers can significantly reduce cost, power, and maintenance efforts in RRH applications.
Given the requirement to design a multi-mode RRH on a single device, let's examine a few system planning issues that should be considered in RRH designs. Factors such as protocol support, the number of antennas and carriers, and the FPGA clock rate affect how compact and efficient an RRH design will be. In this article we will focus on CPRI and DUC configuration.
RRH system model
Typically, a base station connects to an RRH via optical cables. In the downlink direction, base band data is transported to the RRH via CPRI links. The data is then up-converted to IF sample rates, preprocessed by CFR or DPD to mitigate non-linear effects of broadband power amplifiers, and eventually sent for radio transmission. A typical system is shown in Figure 1.
Figure 1. Block diagram of a typical RRH System
The CPRI specification is an initiative to define a publicly available specification that standardizes the protocol interface between the radio equipment control (REC) and the radio equipment (RE) in wireless base stations. This allows interoperability of equipment from different vendors, while preserving the software investment made by wireless service providers. Figure 2 illustrates a CPRI interface.
Figure 2. CPRI Interface
When designing an RRH with a CPRI link, there are a few system level decisions that must be made regardless of the actual hardware implementation of the CPRI interface:
- The wireless standard being supported, and thus which CPRI mapping method is required
- The number of antenna-carrier (AxC) interfaces required per CPRI link
- The CPRI line rate
- The CPRI output data format
Each of these decisions will be discussed in the following sections.
CPRI support for multiple wireless standards
CPRI Specification v4.2 is based on the Universal Mobile Telecommunication System (UMTS), the WiMAX IEEE Std 802.16-2009, and the Evolved UMTS Terrestrial Radio Access (E-UTRA), with the possibility of supporting other wireless standards in future revisions of the CPRI specifications.
The three mapping methods described in CPRI v4.2 target only the three standards listed above. However, many CPRI IP vendors provide some flexibility in supporting customer-defined mapping modules, which means it may be possible to support additional wireless standards.
CPRI frame structure
A basic CPRI frame has a duration of Tc = 1/fc = 1/3.84MHz = 260.41667ns. The basic frame structure is shown in Figure 3, where T is the word length in bits, given by (Line Rate in Mbps)/76.8, so it varies with the line rate. For a 3.072Gbps line rate, for example, T is 40. One hyperframe is made up of 256 basic frames, and a 10ms CPRI frame consists of 150 hyperframes.
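The timing relationships above can be checked with a few lines of Python. This is only a sketch: the function names are illustrative and not part of any CPRI IP, and line rates are passed as decimal strings so the arithmetic stays exact.

```python
from fractions import Fraction

CHIP_RATE_HZ = 3_840_000            # fc, the rate that defines the basic frame
BASIC_FRAMES_PER_HYPERFRAME = 256
HYPERFRAMES_PER_CPRI_FRAME = 150


def basic_frame_duration_ns():
    """Tc = 1/fc, expressed in nanoseconds."""
    return float(Fraction(10**9, CHIP_RATE_HZ))


def word_length_bits(line_rate_mbps):
    """Word length T = (line rate in Mbps) / 76.8; line rate given as a string."""
    t = Fraction(line_rate_mbps) / Fraction("76.8")
    if t.denominator != 1:
        raise ValueError("line rate does not give an integer word length")
    return int(t)


def cpri_frame_duration_ms():
    """Duration of 150 hyperframes of 256 basic frames each, in milliseconds."""
    basic_frames = BASIC_FRAMES_PER_HYPERFRAME * HYPERFRAMES_PER_CPRI_FRAME
    return Fraction(basic_frames * 1000, CHIP_RATE_HZ)


if __name__ == "__main__":
    print(f"Tc = {basic_frame_duration_ns():.5f} ns")
    print(f"T at 3.072 Gbps = {word_length_bits('3072.0')} bits")
    print(f"CPRI frame = {cpri_frame_duration_ms()} ms")
```

Multiplying the word length by 16 gives the number of decoded bits in one basic frame, control word included.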
A basic frame consists of 16 words, where the first word of each basic frame is a control word. The other 15 words are used to carry user plane data (SAP_IQ) as shown in Figure 2. The user plane information is presented in the form of in-phase and quadrature base band data, or IQ data. The frame structure illustrated in Figure 3 dictates the amount of user plane data a particular line rate can support. The next subsection explains how to select the line rate based on a user application.
Figure 3. Basic frame structure for different CPRI line rates. T is the word length and varies depending on the line rate.
Choosing the CPRI line rate
The basic frame structure in Figure 3 illustrates the amount of user plane data a particular line rate can carry. The following equation calculates how many data bits are available in a CPRI basic frame to carry IQ data:
Number of IQ bits per basic frame = [Line Rate (in Mbps)/3.84] x (15/16) x (8/10)    (1)
Here, (Line Rate in Mbps)/3.84 is the number of line bits transmitted during one basic frame. The factor 15/16 accounts for the fact that out of the 16 words in a basic frame, 15 are data words. The factor 8/10 accounts for the 8B/10B encoding that the CPRI specification requires in the Tx direction. With 8B/10B, only 80% of the CPRI line capacity is used to transmit non-encoded data, with the other 20% being used for encoding redundancy.
Based on Equation (1), the number of IQ data bits per basic frame as a function of CPRI line rates is listed in Table 1.
Table 1. Number of IQ bits per basic frame as a function of CPRI line rates.
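As an illustration of this calculation, the sketch below scales the line bits sent during one basic frame by the 15/16 and 8/10 factors described above. The list of line rates is an assumption (commonly quoted CPRI rates) and the function name is hypothetical.

```python
from fractions import Fraction

# Assumed set of CPRI line rates in Mbps, written as strings for exact arithmetic.
CPRI_LINE_RATES_MBPS = ["614.4", "1228.8", "2457.6", "3072.0", "4915.2", "6144.0"]


def iq_bits_per_basic_frame(line_rate_mbps):
    """IQ bits available in one basic frame for a given line rate (string, Mbps)."""
    line_bits = Fraction(line_rate_mbps) / Fraction("3.84")   # bits per basic frame
    data_words = Fraction(15, 16)                             # 15 of 16 words carry IQ data
    decoded = Fraction(8, 10)                                 # 8B/10B coding overhead
    return int(line_bits * data_words * decoded)


if __name__ == "__main__":
    for rate in CPRI_LINE_RATES_MBPS:
        print(f"{rate:>7} Mbps -> {iq_bits_per_basic_frame(rate)} IQ bits")
```

For each rate the result equals 15 times the word length T, as expected from the frame structure.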
The selected CPRI line rate must be able to support the wireless system's total bandwidth. That is, the amount of IQ data that crosses the CPRI link between the base station and the RRH during one 260.42ns basic frame must not exceed the number of IQ bits listed in Table 1 for that line rate.
The following example considers a single sector, mixed bandwidth LTE FDD system with two transmitting and two receiving antennas. Across a 20MHz allocated bandwidth per antenna, a 10MHz LTE carrier runs concurrently with two 5MHz LTE carriers.
In this example, there are a total of (1 + 2) x 2 = 6 antenna-carrier pairs (AxCs), where the factor 2 accounts for the 2 antennas on either the transmitting or the receiving side. Assume both I and Q data are 16 bits wide. The number of bits the 6 antenna-carrier pairs carry during a 260.42ns basic frame can be calculated as [Sample Rate (in MHz)/3.84] x 16 x 2 x [Number of AxCs]. In this example, the total number of IQ bits from the application is:
(30.72/3.84) x 32 x 2 + (7.68/3.84) x 32 x 4 = 512 + 256 = 768.
Comparing 768 with the total number of IQ bits that each line rate supports in Table 1 shows that 4.9152Gbps is the minimum line rate required for this application. Alternatively, multiple parallel CPRI links can be used to support high-throughput, high-bandwidth applications. In most cases, however, having multiple parallel CPRI links complicates data path synchronization in the actual implementation. It also requires multiple optical cables between the REC and the RE, which adds to the system setup and maintenance cost.
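The selection procedure can be sketched in Python as follows. The sample rates and AxC counts are the ones appearing in the expression above; the list of candidate line rates and all function names are assumptions made for illustration.

```python
from fractions import Fraction

# Assumed candidate CPRI line rates in Mbps, in ascending order.
LINE_RATES_MBPS = ["614.4", "1228.8", "2457.6", "3072.0", "4915.2", "6144.0"]


def iq_capacity_bits(line_rate_mbps):
    """IQ bits one basic frame can carry at the given line rate."""
    line_bits = Fraction(line_rate_mbps) / Fraction("3.84")
    return line_bits * Fraction(15, 16) * Fraction(8, 10)


def required_iq_bits(carriers, sample_width_bits=16):
    """carriers: list of (sample rate in MHz as a string, number of AxCs)."""
    total = Fraction(0)
    for sample_rate_mhz, num_axc in carriers:
        samples_per_frame = Fraction(sample_rate_mhz) / Fraction("3.84")
        # Each sample contributes one I word and one Q word.
        total += samples_per_frame * sample_width_bits * 2 * num_axc
    return total


def minimum_line_rate(required_bits):
    """Lowest candidate rate whose capacity covers the requirement, else None."""
    for rate in LINE_RATES_MBPS:
        if iq_capacity_bits(rate) >= required_bits:
            return rate
    return None


if __name__ == "__main__":
    example = [("30.72", 2), ("7.68", 4)]   # terms of the expression in the text
    needed = required_iq_bits(example)
    print(f"required IQ bits: {needed}, minimum line rate: {minimum_line_rate(needed)} Mbps")
```

A return value of None indicates that no single link in the list is sufficient, which is the case where multiple parallel CPRI links would have to be considered.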
CPRI output data format
Although different users may implement CPRI and the subsequent DUC designs differently, a framer or data re-formatter is commonly needed between the CPRI and DUC modules. A DUC is designed to maximize hardware reuse because of its computational complexity. To share the multiplier resources efficiently in the FIR filter chain, the multi-channel input data to the DUC usually needs to be arranged in a certain pattern. The data pattern should allow AxCs to access the FPGA logic and multiplier resources in a time division multiplexing (TDM) fashion. The framer or format converter design depends on the CPRI output data format and the required DUC input data format. It is commonly implemented using the FPGA on-chip memory.
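As a software analogy of such a format converter, the sketch below interleaves per-AxC sample streams into a single TDM stream and splits it back. It assumes that all AxCs run at the same sample rate and that a simple round-robin pattern is wanted; the actual pattern depends on the CPRI IP and the DUC design, and the function names are hypothetical.

```python
def tdm_interleave(axc_streams):
    """Merge equal-length per-AxC sample lists into one round-robin TDM stream."""
    lengths = {len(stream) for stream in axc_streams}
    if len(lengths) > 1:
        raise ValueError("all AxC streams must have the same length")
    # zip(*streams) yields one time slot group per sample index.
    return [sample for slot in zip(*axc_streams) for sample in slot]


def tdm_deinterleave(tdm_stream, num_axc):
    """Recover the per-AxC sample lists from a round-robin TDM stream."""
    return [tdm_stream[k::num_axc] for k in range(num_axc)]


if __name__ == "__main__":
    streams = [["a0", "a1"], ["b0", "b1"], ["c0", "c1"]]
    tdm = tdm_interleave(streams)
    print(tdm)
    print(tdm_deinterleave(tdm, len(streams)))
```

In hardware the same reordering is typically done by writing samples into on-chip memory in one order and reading them out in another.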
A typical DUC and DDC system for a single standard RRH is shown in Figure 4. Base band data is first filtered by a FIR channel filter, then upsampled. A final cascaded integrator-comb (CIC) filter provides a variable rate change. A CIC filter uses only addition and subtraction to realize low pass filtering, without resorting to multiplications. In multiplier-hungry DUC designs, it is a highly hardware-friendly solution. The only drawback is that a FIR compensation filter is needed to alleviate the pass band droop of CIC filters. A numerically controlled oscillator (NCO) generates digital sinusoidal waveforms, and a complex mixer provides IF stage mixing.
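To make the multiplier-free property concrete, here is a minimal sketch of a CIC interpolator in plain Python. It assumes a differential delay of one sample and integer input data; it is a behavioral model for illustration, not a description of any particular FPGA implementation.

```python
def cic_interpolate(samples, rate, stages=1):
    """Interpolate by `rate` using `stages` comb and integrator sections."""
    data = list(samples)

    # Comb sections run at the low input rate: y[n] = x[n] - x[n-1].
    for _ in range(stages):
        previous = 0
        combed = []
        for x in data:
            combed.append(x - previous)
            previous = x
        data = combed

    # Rate change: insert (rate - 1) zeros after every sample.
    upsampled = []
    for x in data:
        upsampled.append(x)
        upsampled.extend([0] * (rate - 1))

    # Integrator sections run at the high output rate: y[n] = y[n-1] + x[n].
    data = upsampled
    for _ in range(stages):
        accumulator = 0
        integrated = []
        for x in data:
            accumulator += x
            integrated.append(accumulator)
        data = integrated
    return data


if __name__ == "__main__":
    print(cic_interpolate([1, 2, 3], rate=4))             # single stage holds each sample
    print(cic_interpolate([1] * 8, rate=2, stages=2))     # settles to a gain of rate**(stages-1)
```

Only additions and subtractions appear in the loops. The gain that grows with the number of stages, together with the pass band droop, is what the compensation stage has to account for.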
Figure 4. Illustrative block diagram of a single mode DUC and DDC on an FPGA.
When planning a DUC module, the biggest challenge is optimizing the filter design. Finding the filter coefficients and filter order that meet the wireless transmission spectrum mask of the various standards is a challenge in itself. In addition, how the multiple filter cascades are partitioned has a great impact on resource and power utilization. Similarly, the IF carrier mixing may be broken down into stages, and when and where the data and carrier mixing happens affects the resource utilization as well. In an RRH supporting multiple wireless standards, it is particularly important to reuse as many resources as possible; otherwise the DUC alone can take up a significant amount of logic and multiplier resources on the FPGA.
Choosing the FPGA clock rate and IF sample rate
Wireless applications are inherently multi-channel because both in-phase and quadrature signals are needed across the entire data path. Multiple-antenna (MIMO) configurations in all leading wireless standards, such as LTE, WiMAX, and TD-SCDMA, require that even more data channels be supported simultaneously. As a result, the FPGA logic must operate at the fastest rate attainable in order to process as many data channels as possible using the same set of resources. To lower cost, hardware sharing has to be maximized, which also means selecting a higher FPGA clock rate.
In DUC applications, FPGA logic often runs at a clock frequency that is an integer multiple of the data path sample rate. Doing so enables the most efficient resource sharing via time division multiplexing (TDM). Furthermore, because the data is aligned with the clocks, the control logic and clocking schemes are simpler.
More recently, the LTE standard has become the prominent candidate for next generation mobile broadband systems. As a result, modern multi-mode RRH systems will most likely support at least the LTE specification. LTE evolved from UMTS, or Wideband CDMA (WCDMA). WCDMA has a chip rate of 3.84MHz, and the LTE sample rates for the 3MHz to 20MHz bandwidth selections are integral multiples of 3.84MHz (the 1.4MHz selection uses half of that rate, 1.92MHz). Table 2 shows the sample rate or clock rate of an LTE RRH as an integral multiple of 3.84MHz. It is quite common for the FPGA clock rate and IF sample rate to be chosen from Table 2.
Table 2. List of sample rates as integral multiple of 3.84MHz.
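The integer relationship between the clock rate and the sample rate translates directly into a TDM sharing factor. The sketch below lists a few multiples of 3.84MHz and the number of channels one hardware resource could serve at a given clock. The choice of multiples is an assumption made for illustration, and the function names are hypothetical.

```python
from fractions import Fraction

BASE_RATE_MHZ = Fraction("3.84")

# Illustrative multiples of the base rate (an assumption, not an exhaustive list).
MULTIPLES = [1, 2, 4, 6, 8, 16, 32, 64, 128]


def sample_rate_mhz(multiple):
    """Sample or clock rate obtained as an integer multiple of 3.84 MHz."""
    return float(BASE_RATE_MHZ * multiple)


def tdm_factor(clock_rate_mhz, data_rate_mhz):
    """Clock cycles available per sample; rates are given as decimal strings."""
    ratio = Fraction(clock_rate_mhz) / Fraction(data_rate_mhz)
    if ratio.denominator != 1:
        raise ValueError("clock rate is not an integer multiple of the sample rate")
    return int(ratio)


if __name__ == "__main__":
    for m in MULTIPLES:
        print(f"{m:>3} x 3.84 MHz = {sample_rate_mhz(m):.2f} MHz")
    print("channels per multiplier at 245.76 MHz, 30.72 MHz data:",
          tdm_factor("245.76", "30.72"))
```

A non-integer ratio raises an error here, which mirrors the reason other standards are first resampled to one of these rates.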
Because LTE evolved from WCDMA (UMTS), WCDMA is supported with little extra effort in most RRH systems targeting LTE. Other major wireless standards such as WiMAX, multi-carrier GSM (MC-GSM), TD-SCDMA, and CDMA2000 can also be supported in the same DUC data path using sample rate converters. That is, the front end filtering in WiMAX, MC-GSM, TD-SCDMA, and CDMA2000 systems needs to convert the input sample rate to a value in Table 2. Doing so allows the subsequent interpolation and IF carrier mixing to be shared with LTE data.
Among the possible FPGA clock rates, 245.76MHz is the most prevalent choice in modern high end FPGAs. It is fast enough to provide efficient and adequate resource sharing, yet low enough to be easily achievable. Since it is 64 times the base sample rate of 3.84MHz, it is also possible to replace the traditional FIR and CIC filter combination with highly efficient half-band filter cascades. An interpolating half-band filter raises the data sample rate by a factor of 2, and only about half of its filter coefficients are non-trivial. In addition, half-band filter cascades typically require fewer taps (i.e., a smaller filter order) than non-half-band interpolation FIRs. As a result, the overall multiplier count in the DUC may be lower, although the actual design optimization needs to be evaluated on a case-by-case basis.
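The coefficient structure of a half-band filter can be seen in a small windowed-sinc design. This is a sketch under stated assumptions: the Hamming window and the 11-tap length are arbitrary illustrative choices, not a filter that meets any particular spectrum mask, and the function names are hypothetical.

```python
import math


def halfband_taps(num_taps):
    """Windowed-sinc half-band low-pass taps; num_taps must be 4k + 3."""
    if num_taps % 4 != 3:
        raise ValueError("use a length of the form 4k + 3")
    mid = num_taps // 2
    taps = []
    for n in range(num_taps):
        k = n - mid
        if k == 0:
            ideal = 0.5                      # center tap of an ideal half-band filter
        elif k % 2 == 0:
            ideal = 0.0                      # every other tap is exactly zero
        else:
            ideal = math.sin(math.pi * k / 2) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))  # Hamming
        taps.append(ideal * window)
    return taps


def interpolate_by_two_stages(rate_ratio):
    """Number of cascaded interpolate-by-2 stages for a power-of-two rate ratio."""
    if rate_ratio < 1 or rate_ratio & (rate_ratio - 1):
        raise ValueError("rate ratio must be a power of two")
    return rate_ratio.bit_length() - 1


if __name__ == "__main__":
    taps = halfband_taps(11)
    print([round(t, 4) for t in taps])
    print("non-zero taps:", sum(1 for t in taps if t != 0.0), "of", len(taps))
    print("stages from 3.84 MHz to 245.76 MHz:", interpolate_by_two_stages(64))
```

Because the non-zero taps are also symmetric, a hardware implementation can share multipliers between mirrored coefficients, which reduces the multiplier count further.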
As technology progresses, future generations of FPGAs will feature more abundant hard multipliers and much faster logic speed. It is therefore possible and even desirable to move FPGA clock rates and IF sample rates even higher, such as 491.52MHz.
Design space exploration
A properly designed DUC module needs to meet the transmission spectrum mask requirement of the wireless standards it supports. In addition, error vector magnitude (EVM) requirements also impact the filter coefficients selection. Regardless of the design criteria, multiple design iterations are commonly required.
The major areas of exploration include multiple stage filter partitioning and IF carrier mixing. Often, along the data up-conversion filter chain, a very long filter with tight transition bandwidth requirements can be broken down into two or more cascaded filters. Each new filter in the chain has a relaxed cutoff frequency or transition bandwidth requirement, and the total filter length may still be smaller than that of the original filter. In other cases, a half-band filter cascade can replace a traditional FIR filter chain to significantly reduce resources. This is the situation in which the half-band filter option described in the previous section becomes attractive.
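The effect of partitioning can be estimated before any filter is designed. The sketch below uses a common rule of thumb for FIR length, roughly (sample rate / transition bandwidth) x (stop band attenuation in dB / 22), to compare a single-stage interpolator with a two-stage one. The numbers in the example (7.68MHz input, 2.25MHz pass band edge, interpolation by 8, 80dB attenuation) are assumptions chosen for illustration, and the estimate is no substitute for an actual filter design.

```python
import math


def estimated_fir_taps(sample_rate_mhz, transition_bw_mhz, atten_db):
    """Rule-of-thumb FIR length estimate (an approximation, not a design)."""
    return math.ceil(sample_rate_mhz / transition_bw_mhz * atten_db / 22.0)


def single_stage_taps(input_rate_mhz, passband_edge_mhz, factor, atten_db):
    """One filter at the output rate that must reject the first image."""
    transition = input_rate_mhz - 2 * passband_edge_mhz
    return estimated_fir_taps(input_rate_mhz * factor, transition, atten_db)


def two_stage_taps(input_rate_mhz, passband_edge_mhz, first, second, atten_db):
    """Tap estimates for an interpolate-by-`first` then by-`second` cascade."""
    mid_rate = input_rate_mhz * first
    stage1 = estimated_fir_taps(mid_rate,
                                input_rate_mhz - 2 * passband_edge_mhz, atten_db)
    # After stage 1 the nearest image sits around the intermediate rate,
    # so the second filter has a much wider transition band.
    stage2 = estimated_fir_taps(mid_rate * second,
                                mid_rate - 2 * passband_edge_mhz, atten_db)
    return stage1, stage2


if __name__ == "__main__":
    single = single_stage_taps(7.68, 2.25, 8, 80.0)
    s1, s2 = two_stage_taps(7.68, 2.25, 2, 4, 80.0)
    print(f"single stage: {single} taps, two stages: {s1} + {s2} = {s1 + s2} taps")
```

Tap count is only part of the picture, since each stage also runs at a different rate, so the comparison should be repeated in terms of multiplications per second for a real design.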
Intermediate frequency (IF) mixing using NCOs and complex mixers can also be broken down into stages. The first stage of complex mixing modulates the IQ data of each channel onto a low IF frequency and sums the channels together. As a result, the subsequent up-conversion filters handle fewer channels. The second stage of mixing further modulates the combined data to the final IF carrier frequency. The resource savings in the filters between the first and second mixing stages can exceed the cost of implementing two stages of mixing. However, the tradeoff needs to be evaluated based on the actual system configuration.
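The equivalence that makes staged mixing possible can be checked numerically. The sketch below mixes each channel to a low offset, sums the channels, and then shifts the sum to the final IF, and compares that with mixing every channel directly to its final frequency. For simplicity it keeps a single sample rate and omits the interpolation filters that would sit between the two stages in a real DUC; the frequencies and function names are assumptions.

```python
import cmath
import math


def nco(freq_mhz, sample_rate_mhz, num_samples):
    """Complex exponential at freq_mhz, as an ideal NCO would generate."""
    step = 2j * math.pi * freq_mhz / sample_rate_mhz
    return [cmath.exp(step * n) for n in range(num_samples)]


def mix(samples, carrier):
    """Complex mixer: sample-by-sample complex multiplication."""
    return [s * c for s, c in zip(samples, carrier)]


def two_stage_mix(channels, offsets_mhz, final_if_mhz, sample_rate_mhz):
    """Mix each channel to its low-IF offset, sum, then shift to the final IF."""
    n = len(channels[0])
    combined = [0j] * n
    for channel, offset in zip(channels, offsets_mhz):
        shifted = mix(channel, nco(offset, sample_rate_mhz, n))
        combined = [a + b for a, b in zip(combined, shifted)]
    return mix(combined, nco(final_if_mhz, sample_rate_mhz, n))


def direct_mix(channels, offsets_mhz, final_if_mhz, sample_rate_mhz):
    """Mix every channel straight to (offset + final IF) and sum."""
    n = len(channels[0])
    combined = [0j] * n
    for channel, offset in zip(channels, offsets_mhz):
        shifted = mix(channel, nco(offset + final_if_mhz, sample_rate_mhz, n))
        combined = [a + b for a, b in zip(combined, shifted)]
    return combined


if __name__ == "__main__":
    chans = [[1 + 0j] * 8, [0.5j] * 8]
    staged = two_stage_mix(chans, [-5.0, 5.0], 30.0, 122.88)
    direct = direct_mix(chans, [-5.0, 5.0], 30.0, 122.88)
    print(max(abs(a - b) for a, b in zip(staged, direct)))
```

In the staged version only one data stream leaves the first stage, so any filter placed after it processes a single combined channel rather than one per carrier.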
In this article we discussed a few system level planning issues that arise when designing a remote radio head system on an FPGA. CPRI is the interface protocol that enables the distributed architecture in the base station. The number of MIMO antennas, the wireless standard being supported, and the bandwidth selection all play a role in determining the minimum CPRI line rate requirement. The digital up and down converters interface with the CPRI module on the RRH, and a format converter is often needed as glue logic. The DUC and DDC are designed to maximize resource reuse. Proper selection of the FPGA clock rate, the filter design partition, and the IF mixing design all play important roles in resource optimization.
References
[1] Common Public Radio Interface (CPRI) Interface Specification, v4.2, Sept. 29, 2010.
[2] Eugene B. Hogenauer, “An economical class of digital filters for decimation and interpolation,” IEEE Transactions on Acoustics, Speech and Signal Processing, pp. 155-162, April 1981.
[3] Fredric J. Harris, Multirate Signal Processing, Prentice Hall, 2004.
About the author
Xiaofei Dong is a Senior Applications Engineer at Altera Corporation. Xiaofei has more than eleven years of experience in designing and implementing wireless systems and DSP algorithms.
Since joining Altera in 2006, her responsibilities have included creating customer-facing design examples, reference designs, and application notes to assist with successful design-in of Altera products, as well as providing technical support for customers relating to intellectual property (IP) and related products.
Xiaofei holds a Ph.D. degree in Electrical and Computer Engineering from the University of California, Davis, and an M.S. degree from the University of Alberta, Edmonton, Canada.