The demand for advanced information services is growing, in terms of the number of users and the services to be supported. Voice and low-rate data services are insufficient for users in a world where high-speed Internet access is taken for granted. The need to support bandwidth-intensive multimedia services places new and challenging demands on cellular systems and networks. Therefore, the International Telecommunication Union (ITU), under the initiative called IMT-2000, came up with a number of standards capable of supporting these requirements.
Many third-generation (3G) wireless standards are based on the wideband code-division multiple-access (W-CDMA) transmission technique. W-CDMA involves spreading signals, each with its own unique sequence, to generate the waveform to be transmitted, and despreading the received waveform to reconstruct the original data. These operations must occur in real time; thus, they require dedicated hardware. Also, advanced features such as multiuser detection and interference cancellation, in which multiple users are tracked and intracell interference is removed, require very high throughput. The same is true of space-time adaptive systems, which use multiple antennas to exploit spatial diversity.
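The spreading and despreading operations can be illustrated with a minimal Python sketch. The two length-4 codes, the bit patterns and the function names below are hypothetical and chosen only for illustration; real W-CDMA codes are far longer, and the channel here is noiseless.

```python
def spread(bits, code):
    """Map each data bit (0/1) to +/-1 and multiply it by every chip of the code."""
    chips = []
    for b in bits:
        symbol = 1 if b else -1
        chips.extend(symbol * c for c in code)
    return chips

def despread(chips, code):
    """Correlate each code-length block with the code and decide on the sign."""
    n = len(code)
    bits = []
    for i in range(0, len(chips), n):
        corr = sum(x * c for x, c in zip(chips[i:i + n], code))
        bits.append(1 if corr > 0 else 0)
    return bits

# Hypothetical orthogonal codes for two users (illustration only).
CODE_A = [1, 1, -1, -1]
CODE_B = [1, -1, 1, -1]

bits_a = [1, 0, 1, 1]
bits_b = [0, 0, 1, 0]

# Both users transmit at once; the channel simply adds their chip streams.
channel = [a + b for a, b in zip(spread(bits_a, CODE_A), spread(bits_b, CODE_B))]

recovered_a = despread(channel, CODE_A)
recovered_b = despread(channel, CODE_B)
```

Because the two codes are orthogonal, each correlation rejects the other user's contribution entirely, so both bit streams are recovered from the shared waveform.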
These speed requirements eliminate the possibility of using generic digital signal processors, which generally do not have the necessary performance. Because an ASIC is limited in flexibility, a full-custom component implementation is equally undesirable. The lack of an appropriate off-the-shelf chip, combined with a need for hardware-accelerated speed, suggests that the best platform is one that uses programmable hardware.
Another requirement for the platform is the ability to implement complex control algorithms for many of the W-CDMA receiver blocks, such as searching, multipath-tracking and finger-assignment algorithms. Since these algorithms are sequential in nature, they can be efficiently implemented in software. This additional need for software execution suggests that an ideal hardware implementation for the receiver system is one that combines programmable logic and a microprocessor.
In their evolution toward increasingly higher integration, programmable-logic devices have incorporated logic and embedded memory and, more recently, microprocessors. Higher integration affords several well-known advantages, but those of particular interest to the manufacturers of 3G infrastructure equipment include higher performance due to elimination of on-chip/off-chip delays, less board real estate and lower power consumption. The combination of programmable logic and microprocessor capabilities in a single device further allows designers to explore which functions benefit most from implementation in software or hardware, because they can easily test parts of a design either in software or in the programmable-logic portion of the device.
Manufacturers of third-generation wireless equipment seeking to differentiate their products by offering various feature sets and levels of performance will do so by making changes in the design of the receiver portion of the W-CDMA modem. This is because transmission techniques are well known and fairly standard, whereas the quality of the reception, as determined by the receiver, for a given product can be a competitive advantage. Specifically, the most complex and critical portion of the receiver is the digital demodulator. The demodulator is responsible for recovering the transmitted message signal after the wireless channel has distorted it, and as a result, its implementation determines the performance of the radio receiver.
In a typical digital demodulator architecture used in W-CDMA applications, the RF front end of the receiver downconverts the incoming signal to an intermediate frequency (IF). The signal is quantized and passed on to the channelizer, which extracts the bandwidth of interest. The wideband signal, which is at the chip rate (3.84 Mchips/s), is despread within the despreader block and converted to a narrowband signal at the symbol rate (supporting data rates of up to 2 Mbits/s). The multipath estimator is used to synchronize the receiver with the transmitter. To increase the capacity of the system, a joint detection and cancellation scheme can be employed that removes the signals that act as noise for any given user.
The multipath combiner then combines the multipaths, or time-delayed versions of a signal, that the despreader can track. The output from the combiner is then deinterleaved and decoded with a Viterbi or turbo decoder.
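The article does not say how the combiner weights the paths. One common choice is maximal-ratio combining, in which each despread path (or "finger") is multiplied by the complex conjugate of its channel estimate before summing. The sketch below assumes that choice, together with a hypothetical three-path channel, perfect channel estimates and no noise.

```python
def mrc_combine(finger_outputs, channel_estimates):
    """Weight each finger by the conjugate of its channel estimate, then sum."""
    return sum(h.conjugate() * y for y, h in zip(finger_outputs, channel_estimates))

# Hypothetical complex gains of three resolvable multipaths.
path_gains = [0.8 + 0.2j, -0.3 + 0.5j, 0.1 - 0.4j]

symbol = -1                                      # transmitted BPSK symbol
fingers = [h * symbol for h in path_gains]       # noiseless despreader outputs

combined = mrc_combine(fingers, path_gains)
decision = 1 if combined.real > 0 else -1
```

The conjugate weighting rotates every path back to a common phase, so the path energies add coherently (here to 0.68 + 0.34 + 0.17 = 1.19 times the symbol) instead of partially cancelling.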
Four of the demodulator blocks (the channelizer, the multipath estimator/despreader, the multiuser detector and the decoder) are of particular importance, since they provide the core functionality.
Signals from the antenna are first processed by the RF chain, which downconverts them to the IF. A wideband A/D converter then quantizes the signal. Within the channelizer, the band of interest is extracted using digital downconversion techniques. The downconversion process typically involves both finite impulse response (FIR) filters and a numerically controlled oscillator (NCO), both of which are available as user-configurable intellectual property (IP) for programmable logic.
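A digital downconverter of this kind can be sketched in a few lines of Python: an NCO mixes the band of interest down to baseband, an FIR filter removes the mixing image, and the result is decimated. The sample rate, carrier frequency, moving-average taps and decimation factor below are hypothetical values picked so the result is easy to check by hand; they are not taken from any W-CDMA design.

```python
import cmath
import math

def nco(freq_hz, sample_rate_hz, count):
    """Numerically controlled oscillator: complex samples rotating at -freq_hz."""
    step = -2j * math.pi * freq_hz / sample_rate_hz
    return [cmath.exp(step * k) for k in range(count)]

def fir(samples, taps):
    """Direct-form FIR filter (zero initial state)."""
    out = []
    for n in range(len(samples)):
        acc = 0
        for k, tap in enumerate(taps):
            if n - k >= 0:
                acc += tap * samples[n - k]
        out.append(acc)
    return out

def downconvert(samples, freq_hz, sample_rate_hz, taps, decimation):
    """Mix to baseband with the NCO, low-pass filter, then keep every Nth sample."""
    lo = nco(freq_hz, sample_rate_hz, len(samples))
    mixed = [s * l for s, l in zip(samples, lo)]
    return fir(mixed, taps)[::decimation]

# Hypothetical test signal: a real carrier at fs/4.
FS = 64.0
CARRIER = 16.0
samples = [math.cos(2 * math.pi * CARRIER * k / FS) for k in range(256)]

# An 8-tap moving average stands in for a properly designed low-pass filter.
baseband = downconvert(samples, CARRIER, FS, [1 / 8] * 8, 8)
```

After the first output (filter start-up), every baseband sample is 0.5: half the carrier amplitude lands at DC and the image at twice the carrier frequency is averaged out. In a PLD, the number of taps and the decimation ratio are exactly the parameters the designer is free to choose.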
One advantage the PLD user has over the user of application-specific standard products (ASSPs) is the flexibility to precisely define the number of filter coefficients, allowing greater control over co-channel interference. With PLDs, the developer may also define the multirate parameters in the channelizer, allowing more flexibility in the rest of the demodulator architecture.
To despread a user's wideband signal, the receiver needs to synchronize its pseudo-noise code with the transmitter. Also, to take advantage of the multipath environment, the delays, or phases, of the different multipaths need to be estimated. The multipath estimator and despreader perform these operations, called acquisition and tracking. There are a number of ways to implement the acquisition scheme.
For example, in the double-dwell method, the received code and the locally generated code are offset a fraction of a chip and correlated for a predetermined period of time to see whether the two sequences are in phase. If they are, a longer examination period is used to ensure that this was indeed the correct phase position; otherwise, the local code is stepped to another phase position, and the process is then repeated.
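The control flow of a double-dwell search can be sketched as follows. This is a simplified illustration under stated assumptions: the local code is stepped in whole chips rather than fractions of a chip, the channel is noiseless, and the 31-chip code, dwell lengths and thresholds are hypothetical.

```python
def m_sequence(length=31):
    """+/-1 chips from a 5-stage LFSR with recurrence a[n] = a[n-3] XOR a[n-5]."""
    bits = [1, 0, 0, 0, 0]
    while len(bits) < length:
        bits.append(bits[-3] ^ bits[-5])
    return [1 if b else -1 for b in bits[:length]]

def correlate(received, code, phase, dwell):
    """Correlate the first `dwell` received chips with the code shifted by `phase`."""
    n = len(code)
    return sum(received[k] * code[(k + phase) % n] for k in range(dwell))

def double_dwell_search(received, code, short_dwell=8, long_dwell=31,
                        short_thresh=6, long_thresh=24):
    """Return the first code phase that passes both dwells, or None."""
    for phase in range(len(code)):
        # First dwell: short, cheap test that rejects most wrong phases.
        if correlate(received, code, phase, short_dwell) < short_thresh:
            continue  # step the local code to the next phase
        # Second dwell: longer examination to confirm the candidate phase.
        if correlate(received, code, phase, long_dwell) >= long_thresh:
            return phase
    return None

CODE = m_sequence()
TRUE_PHASE = 11  # hypothetical, unknown to the receiver
received = [CODE[(k + TRUE_PHASE) % len(CODE)] for k in range(len(CODE))]

found = double_dwell_search(received, CODE)
```

The loop in `double_dwell_search` is the sequential control logic, while `correlate` is the multiply-accumulate work that runs at the chip rate.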
Given the iterative nature of the process, the control part of it can be effectively implemented in software and executed by the embedded processor. The correlator portion can be implemented in the logic portion and can work in tandem with the processor executing the control routine. If the two reside in the same device, as with an embedded processor PLD, for example, delays are minimized compared with those in architectures that utilize a discrete processor.
To track the code once it has been acquired, an "early-late" gate device is used. In this scheme, advanced and retarded replicas of the internally generated code are each correlated with the received code. The difference between the two correlator outputs forms an error signal that drives the voltage-controlled code generator. In this manner, the internally generated code closely tracks the received code.
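An early-late discriminator is conventionally formed from the difference of the two correlator outputs, which is zero when the local code is aligned and changes sign with the direction of the timing error. The sketch below models that loop under simplifying assumptions: the code autocorrelation is idealized as a triangle one chip wide on each side, there is no noise, and the half-chip spacing, loop gain and sign convention are hypothetical.

```python
def triangle(offset_chips):
    """Idealized PN-code autocorrelation versus timing offset, in chips."""
    return max(0.0, 1.0 - abs(offset_chips))

def early_late_error(timing_error, spacing=0.5):
    """Late-minus-early correlator difference for a given timing error."""
    early = triangle(timing_error + spacing)  # replica advanced by `spacing`
    late = triangle(timing_error - spacing)   # replica retarded by `spacing`
    return late - early

def track(initial_error, gain=0.25, steps=40):
    """First-order loop: nudge the code timing against the discriminator output."""
    error = initial_error
    history = [error]
    for _ in range(steps):
        error -= gain * early_late_error(error)
        history.append(error)
    return history

history = track(0.3)  # start 0.3 chip away from alignment
```

With these values the discriminator output is twice the timing error for small errors, so each iteration halves the residual misalignment and the loop settles at zero.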
The capacity of a W-CDMA system is interference-limited; that is, every user acts as interference for every other user. The more resistant the system is to interference, the more users can be served. Multiuser detection (MUD) techniques, in which multiple users are tracked and their signals removed from each other's signals, reduce the effect of multiple-access interference and increase system capacity. MUD schemes are actively discussed in academic circles, because MUD is a hardware-intensive process that requires significant processing, making an efficient implementation a challenge in its own right.
Most MUD schemes involve tracking multiple paths of multiple users, estimating their signal strengths and then regenerating them before they can be removed from other users' signals. In generating the interference signal, matrix inversion may be involved, which is quite processing-intensive and difficult to perform in real-time. Generic DSPs, which have limited parallel-processing ability, cannot meet these requirements, but programmable logic can. For example, a fast Fourier transform in a PLD can perform a real-time matrix inversion, and IP is available for PLD users for exactly this task.
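The estimate, regenerate and subtract steps can be illustrated with successive interference cancellation, one of several MUD schemes and chosen here only because it is compact; the article does not single it out. The two non-orthogonal codes and the amplitudes are hypothetical, and the channel is noiseless.

```python
def correlate(signal, code):
    """Matched-filter output: inner product of the signal with a user's code."""
    return sum(s * c for s, c in zip(signal, code))

def sign(x):
    return 1 if x >= 0 else -1

def sic_detect(received, strong_code, weak_code):
    """Detect the strong user, regenerate and subtract it, then detect the weak user."""
    n = len(strong_code)
    z = correlate(received, strong_code)
    strong_bit = sign(z)
    strong_amp = abs(z) / n                       # crude signal-strength estimate
    regenerated = [strong_amp * strong_bit * c for c in strong_code]
    residual = [r - g for r, g in zip(received, regenerated)]
    weak_bit = sign(correlate(residual, weak_code))
    return strong_bit, weak_bit

# Hypothetical non-orthogonal codes (cross-correlation of 2 over 4 chips).
CODE_1 = [1, 1, 1, 1]
CODE_2 = [1, 1, 1, -1]

# User 1 is strong (amplitude 3, bit +1); user 2 is weak (amplitude 1, bit -1).
received = [3 * 1 * c1 + 1 * (-1) * c2 for c1, c2 in zip(CODE_1, CODE_2)]

conventional_weak_bit = sign(correlate(received, CODE_2))
strong_bit, weak_bit = sic_detect(received, CODE_1, CODE_2)
```

A plain matched filter decides +1 for the weak user because the strong user's leakage swamps it; after cancellation the weak user's bit is correctly detected as -1. Schemes that cancel many users jointly replace the single subtraction with a matrix operation, which is where the heavy arithmetic arises.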
One promising method for further enhancing the capacity of a system is the adaptive-antenna technique. In this technique, the spatial diversity of the environment is leveraged: the spatial profiles (that is, the distributions of signal energy in space) of the desired user and of the interfering users are exploited to improve reception. The technique involves creating a beam with maximum gain in the direction of the desired user and nulls in the directions of the interfering users. Using this technique, the desired signal is amplified, while the interfering signals are attenuated before the signal is decoded.
As the name implies, the adaptive antenna technique employs a scheme that attunes itself to the constantly changing environment and positions of users. It involves a feedback loop in which the weighting factors for the different antenna elements are constantly adjusted for optimal signal detection, for example by the steepest-descent method, which minimizes the mean square value of the error function. Given that the update to the weighting factors happens close to symbol rate and is mathematically complex, it is better suited to implementation in software than in hardware. The weighting of the antenna signals and the despreading operation, which are done at multiples of chip rate, can be implemented in logic, while the evaluation of the weighting factors can be done in software running on the embedded processor.
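The weight-update loop can be sketched with the least-mean-squares (LMS) rule, a widely used stochastic approximation of steepest descent; the article names only steepest descent, so LMS is a substitution made here for brevity. The two-element array, steering vectors, step size and training sequences are all hypothetical, and the channel is noiseless.

```python
def lms_beamformer(snapshots, reference, mu=0.1):
    """Adapt complex weights w so that y = w^H x follows the reference signal."""
    w = [0j] * len(snapshots[0])
    for x, d in zip(snapshots, reference):
        y = sum(wm.conjugate() * xm for wm, xm in zip(w, x))   # array output
        e = d - y                                              # error signal
        # Step against the gradient of |e|^2 with respect to the weights.
        w = [wm + mu * xm * e.conjugate() for wm, xm in zip(w, x)]
    return w

# Hypothetical two-element array: the desired user arrives with equal phase on
# both elements, the interferer with opposite phase.
A_DESIRED = [1 + 0j, 1 + 0j]
A_INTERFERER = [1 + 0j, -1 + 0j]

desired = [1 if k % 2 == 0 else -1 for k in range(200)]
interferer = [1 if (k // 2) % 2 == 0 else -1 for k in range(200)]

snapshots = [
    [d * ad + i * ai for ad, ai in zip(A_DESIRED, A_INTERFERER)]
    for d, i in zip(desired, interferer)
]

weights = lms_beamformer(snapshots, desired)

def array_gain(w, steering):
    """Response of the adapted array to a unit signal from a given direction."""
    return sum(wm.conjugate() * s for wm, s in zip(w, steering))

gain_desired = array_gain(weights, A_DESIRED)
gain_interferer = array_gain(weights, A_INTERFERER)
```

The weights settle at 0.5 on each element, giving unit gain toward the desired user and a null toward the interferer. The per-symbol update in `lms_beamformer` is the part suited to the processor, while applying the weights to every chip-rate sample is the part suited to logic.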
Turbo coding is a scheme based on convolutional codes in which the data is encoded twice and then decoded iteratively. First, a recursive convolutional encoder codes the data, and then a second encoder codes an interleaved version. (Interleaving is the process of rearranging a block of data bits in a pseudorandom fashion.) Recently, Turbo codes have been attracting much attention because of their relatively large coding gain with reasonable computational complexity. Compared with other methods, such as conventional convolutional coding and Reed-Solomon codes, they come closer to the Shannon limit, the theoretical maximum rate of information transfer over a noisy channel.
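The two-encoder structure can be sketched as follows. The generator polynomials (feedback 1 + D^2 + D^3, feedforward 1 + D + D^3) are, to the best of my understanding, those of the 8-state constituent code used in W-CDMA, but that should be checked against the standard. The interleaver is a deliberately simple hypothetical permutation, not the standard's prime-based interleaver, and trellis termination is omitted.

```python
def rsc_parity(bits):
    """Parity stream of a recursive systematic convolutional encoder.

    Feedback polynomial 1 + D^2 + D^3, feedforward polynomial 1 + D + D^3.
    """
    s1 = s2 = s3 = 0              # shift register, s1 holds the newest value
    parity = []
    for u in bits:
        a = u ^ s2 ^ s3           # recursive (feedback) bit
        parity.append(a ^ s1 ^ s3)
        s1, s2, s3 = a, s1, s2    # shift the register
    return parity

def interleave(bits):
    """Hypothetical stand-in interleaver: output i takes input (3*i + 1) mod n.

    This is a valid permutation only when n is not a multiple of 3.
    """
    n = len(bits)
    return [bits[(3 * i + 1) % n] for i in range(n)]

def turbo_encode(bits):
    """Return the systematic bits plus one parity stream from each encoder."""
    return list(bits), rsc_parity(bits), rsc_parity(interleave(bits))

# A single 1 followed by zeros shows the encoder's impulse response.
block = [1, 0, 0, 0, 0, 0, 0, 0]
systematic, parity1, parity2 = turbo_encode(block)
```

Because the encoder is recursive, a single input bit keeps producing parity bits indefinitely, and the interleaver ensures that the second encoder sees that bit at a different position in the block. These two properties are what the iterative decoder exploits.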
The Turbo decoder consists of two individual decoders and an interleaver/deinterleaver. During the decoding process, correction values derived from decoding the data from one encoder can be used to aid the decoding of the data from the other. The decoders exchange and improve correction values several times before applying the corrections to the output. The Turbo encoding/decoding scheme specified in the W-CDMA standard is quite complex. The internal decoders are typically based on soft-input/soft-output Viterbi decoders or maximum a posteriori decoders, which must carry out many operations in parallel to achieve the required data rate of 2 Mbits/s. The interleaver, as specified for the W-CDMA standard, is also very complex to implement because it depends on prime number sequences and must interleave differently for every possible block size.
As with other components of the demodulator, the complexity of the Turbo decoder and its performance requirements under the W-CDMA standard preclude its implementation with DSP processors. Programmable logic can, however, meet the 2 Mbits/s specification, and existing Turbo encoder/decoder IP for programmable logic can achieve even higher data rates when used in multiple instances.
The 3G, W-CDMA-based standard has the potential to evolve mobile communication to the next level, where high-speed data and multimedia capabilities become a reality for the masses. To enable this, systems with advanced techniques such as multiuser detection and adaptive antennas need to be developed. A platform that is flexible, and at the same time has the features required for advanced systems, is key. One such platform is that which combines programmable logic and embedded processor capabilities. Currently, many developers of 3G equipment are using PLDs to prototype their systems. Given the integration, flexibility and time-to-market advantages of embedded processor PLDs, as well as the option to migrate to mask-programmed versions of these devices at one-tenth the cost, increasing numbers of 3G systems will also benefit from this latest generation of programmable logic.