Digital Signal Processing With or Without a DSP
You don't need a DSP chip to do digital signal processing. Here are some alternatives that just might meet your needs.


Spectra


by Don Morgan
In the last few issues, we’ve examined some popular digital signal processors from Motorola, Analog Devices, and Texas Instruments. In addition to standard arithmetic architectures, each of these devices has special features that make it valuable for particular applications as well as for general digital signal processing.
Digital signal processing may embody any operation on a data sequence. It may be a simple AND operation, a two- or three-dimensional mask, a polynomial filter such as an FIR, or a transform. All of the DSPs we have discussed offer arithmetic hardware that makes certain mathematical procedures easier, such as an optimized multiply/accumulate. One model adds refinements for automating transform processing, some include the hardware to easily accept serial A/Ds, and one actually possesses a dual pipeline. As a rule, the construction of these DSPs makes arithmetic computation more efficient and faster to perform, and most of them offer greater overall speed than the common microprocessor or microcontroller.
These enhancements make the parts valuable, but neither the number of such devices, nor their name, nor their arithmetic architecture should be taken to mean that digital signal processing can only be done on a DSP. Digital signal processing can be done on any microprocessor or microcontroller. In this column, we’ll examine these and some alternative methods of performing DSP operations. These alternatives are, in fact, becoming more important for both reducing cost and increasing speed and efficiency. In this issue, we will look at PLDs and FPGAs, as well as ICs built expressly to implement some particular aspect of digital signal processing. Such parts can be put together with standard microcontrollers to create sophisticated systems.
DSP components
An example of such a case might be found in systems employing A/Ds as an interface to the real world. The analog circuitry that prepares the signal for the A/D not only delivers the signal to the converter, it also carries along any offsets from the preceding stages. In addition, the A/D itself can add certain products to the signal that require lowpass filtering before the data can be used. In many cases, either or both of these effects are clearly undesirable, even deleterious. In audio processing, for example, any DC bias on the input signal can offset compressors, filters, and amplifiers so that the signal is attenuated and distorted. Sliding filters that use signal magnitude to determine the cutoff frequency can be fooled by such bias and never reach the proper operating point.
In a case such as this, it would be nice to have a highpass filter between the A/D and the processing elements. If you have a DSP in the system and it has bandwidth remaining after its application processing, you may be able to add the code to remove these products there. However, if you don’t already have a DSP, or there isn’t enough room or time left for the filter, you may want to choose another course. (As a side note, some manufacturers are beginning to include highpass filters in their A/Ds to help control these offsets.) The other problem, imaging, can be eliminated with an appropriate lowpass filter placed after the A/D converter.
Commercial off-the-shelf (COTS) parts are available for these tasks from several companies, including Harris Semiconductor. [Note (5/11/00): Harris Semiconductor has changed its name to Intersil Corporation.] The nice thing about these parts is that they are usually self-contained, require only a small understanding of signal processing, and generally don’t require programming. They are very useful for low-cost audio and video applications. Other applications, such as radio, require parts such as these because DSPs are not fast enough.
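To make the highpass idea concrete, here is a minimal sketch of a DC-blocking filter of the kind you might run on an ordinary microcontroller when no DSP is available. It is written in Python purely for illustration. The function name `dc_block`, the pole value `r = 0.995`, and the 48 kHz test signal are all assumptions chosen for the example, not values taken from any particular part. The filter is the common one-pole, one-zero form y[n] = x[n] - x[n-1] + r * y[n-1].

```python
import math

def dc_block(samples, r=0.995):
    """DC-blocking highpass: y[n] = x[n] - x[n-1] + r * y[n-1].

    r is an assumed pole position just inside the unit circle.
    """
    out = []
    prev_x = 0.0
    prev_y = 0.0
    for x in samples:
        y = x - prev_x + r * prev_y
        out.append(y)
        prev_x, prev_y = x, y
    return out

# Illustrative input: a 1 kHz tone riding on a DC offset of 0.5 (assumed values).
fs = 48000
biased = [0.5 + 0.1 * math.sin(2 * math.pi * 1000 * n / fs) for n in range(9600)]
cleaned = dc_block(biased)

# Once the filter has settled, the average of the output should be close to zero.
settled = cleaned[-4800:]
print("mean before:", sum(biased) / len(biased))
print("mean after: ", sum(settled) / len(settled))
```

The closer `r` is to one, the lower the cutoff frequency and the longer the filter takes to settle. A fixed-point version would also need some care with rounding in the feedback term.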
In radio, better signal-to-noise performance can be had by moving FIR (linear-phase) elements as far up the IF chain as possible. Also, halfband FIR filters are available for use in quadrature splitting for the removal of unwanted sidebands. Many companies offer discrete parts that will perform some of these individual tasks for radio. Harris Semiconductor offers a relatively full line of DSP components that can be used individually or in combination to perform many functions. Among its products are FIR filters, quadrature decoders, multipliers, halfband filters, numerically controlled oscillators, histogrammers, video image filters, and convolvers. As you can see, it’s possible to put together a DSP system that relies on hardware with a minimum of programming. For more information concerning these components, along with application notes, visit http://www.intersil.com/sitemap.asp.
In addition to its DSPs, Analog Devices has an interesting series of non-DSP products featuring DSP functionality, including a sample rate converter and a video codec. The sample rate converters from Analog Devices (the AD1890, AD1891, and AD1892) allow two digital systems with asynchronous clocks to share data almost seamlessly. The sample rate of the incoming stream, usually AES/EBU or S/PDIF, clocks the input side, while the receiving system clocks the output side. A complex set of decimating and interpolating FIR filters within the chip moves the data from one sample clock to the other with low jitter. These parts accept a wide range of sample clocks, and the ratio between the two sample clocks can also be quite large. Applications for these parts include digital mixing consoles and digital audio interfaces; CD-R, DAT, DCC, and MD recorders; routers; switches; broadcast equipment; and so on.
The video codec, the ADV601, is a new part that incorporates wavelet technology in a chip that allows for the real-time compression of video signals for transmission or storage.
The chip, in conjunction with a DSP, provides precise control of the compressed bit rate, with the DSP setting the bin widths for the compression. Compression ratios range from visually lossless to 350:1. It will interface to a wide variety of equipment, including CCIR-656 equipment, and has an 8-, 16-, or 32-bit host interface. Applications for this chip include network and Internet video, editing, video capture, remote CCTV, digital cameras, archival systems, and so on. For more information on these parts, visit Analog Devices’ Web site at: www.analogdevices.com.
FPGAs and PLDs
Fortunately, several FPGA and PLD manufacturers, including Altera and Xilinx, offer prepared software for their devices. For information on what these two manufacturers offer, visit their Web pages: www.altera.com/common/sitemap.html and www.xilinx.com/products/technology/dsp/index.htm. These functions are intellectual property; they are usually sold as macros or cores and may be expanded or customized for your particular application. They are placed in the target device and compiled along with whatever other program you desire. The functions these companies offer include forward and inverse FFT functions; DCT; FIR filters; floating-point adders, dividers, and multipliers; JPEG encoders and decoders; Laplacian edge detection; adaptive filters; and oscillators.
Distributed arithmetic
Distributed arithmetic is not a new technique, but it is not in common use on mainstream processors. It may not seem obvious at first glance, but it is a simple mix of Boolean logic and algebra. Distributed arithmetic replaces the sum of products with a table lookup method: the technique converts the input sample into an effective address into a table of scaled coefficients, which are then summed.
To appreciate the mechanism here, let’s look at some familiar forms. First, recall the expression for linear time-invariant (LTI) systems:
$$y[n] = \sum_{k=1}^{K} A_k x_k \tag{1}$$
This formula describes the output at only a single instant, $y[n]$, out of the unending sequence of outputs over time. $A_k$ is the $k$th coefficient of the polynomial and represents the system impulse response, and $x_k$ is the $k$th input sample at time $n$. Each output is equal to the sum of $K$ products.
We use binary arithmetic, which is expressed, as values in any base are, with the expression for a polynomial:
$$x_k = -x_{k0} + \sum_{b=1}^{B-1} x_{kb}\, 2^{-b} \tag{2}$$
Here, $x_{k0}$ is the sign bit, $x_{kb}$ is the value of the bit at position $2^{-b}$, and $B$ is the number of bits in the word, including the sign bit. This equation simply states that the binary fraction we are representing is a sum of the contributions of each of the powers of two necessary for the required precision. In other words, the binary number 0.11010011 is equal to the value of the sum:
$$-0\cdot 2^{0} + 1\cdot 2^{-1} + 1\cdot 2^{-2} + 0\cdot 2^{-3} + 1\cdot 2^{-4} + 0\cdot 2^{-5} + 0\cdot 2^{-6} + 1\cdot 2^{-7} + 1\cdot 2^{-8} \tag{3}$$
One nice thing about base two is that each bit position can only have a value of either one or zero, which means that we can turn our original equation into a series of sums of each bit position multiplied by the scaled coefficients of the impulse response. Since the bits can only assume a value of zero or one, the multiplication is really a Boolean AND operation. We can make this clearer by substituting our expression for a binary fraction in Equation 2 back into the equation for an LTI system (Equation 1):
$$y[n] = \sum_{k=1}^{K} A_k \left[ -x_{k0} + \sum_{b=1}^{B-1} x_{kb}\, 2^{-b} \right] \tag{4}$$
And now we expand the equation:
$$y[n] = -\sum_{k=1}^{K} A_k x_{k0} + \sum_{b=1}^{B-1} \left[ \sum_{k=1}^{K} A_k x_{kb} \right] 2^{-b} \tag{5}$$
What you see here is that each bit is used as a gating function for summing scaled versions of the coefficients at each power of two.
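The bit-level expansion can be checked numerically. The sketch below is not part of any vendor's implementation. It evaluates the sum of products one bit position at a time, as Equation 5 describes: the sign bits gate a sum of coefficients that is subtracted, and every other bit position b gates a sum that is weighted by 2^-b. The function names, coefficients, and sample values are all hypothetical, and Python is used only for readability.

```python
def bits_msb_first(value, width):
    """Return the two's complement bits of value, sign bit first."""
    raw = value & ((1 << width) - 1)          # wrap negatives into width bits
    return [(raw >> (width - 1 - b)) & 1 for b in range(width)]

def bit_gated_sum(coeffs, samples, width):
    """Sum of products computed one bit position at a time.

    Each sample is an integer standing for the fraction sample / 2**(width - 1).
    """
    rows = [bits_msb_first(x, width) for x in samples]
    # Sign bits carry a weight of -1.
    total = -sum(a for a, row in zip(coeffs, rows) if row[0])
    # Every other bit gates its coefficient into a sum weighted by 2**-b.
    for b in range(1, width):
        gated = sum(a for a, row in zip(coeffs, rows) if row[b])
        total += gated * 2.0 ** -b
    return total

# Hypothetical data: four coefficients and four 9-bit samples (sign + 8 bits).
coeffs = [0.25, -0.5, 0.125, 0.75]
samples = [211, -64, 37, -200]                # 211 is 0.11010011 in binary
width = 9

direct = sum(a * x / 2 ** (width - 1) for a, x in zip(coeffs, samples))
print(bit_gated_sum(coeffs, samples, width), direct)
```

No multiplier appears in `bit_gated_sum` apart from the power-of-two weighting, which in hardware is just a shift. The `if row[b]` test plays the part of the AND gate.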
To put that another way, each bit of the input sample (variable) is ANDed with all the bits of the particular scaled coefficient and that result is summed. A lookup table can be implemented that is addressed by the bits of the input sample. The contents of that table would be the sums of the scaled coefficients indicated by that address. Thus, an LTI system can be implemented with addition, subtraction, and simple scaling (arithmetic right shifts).
The second form of this technique is more like an extension. It is also a table lookup technique involving partial products. In the case of linear-phase FIR filters, where the coefficients are symmetric around the center values, we may fold the input data word about the center and add the symmetric taps before multiplying them by the coefficients. We use the input sample bits as described in Equation 2, but we fold on the center, and the corresponding bits are summed and used as addresses for partial products. Since we know the coefficients, a table can be made that will form the products of the bit subfields in the input and the coefficients. These partial products would then be summed to produce the complete product, which is again summed with the other products to become the sum of products. The advantages of these processes are their speed and ease of implementation in an FPGA.
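As a rough illustration of both ideas, the following Python sketch builds the lookup table of coefficient sums, uses the bits of the input samples as table addresses, and then repeats the calculation after folding a symmetric filter about its center. The function names and the filter data are hypothetical. The folded version shown here simply pre-adds the paired samples and feeds them to the same table lookup, which is only one way of combining folding with a table and not necessarily the exact arrangement described above. The sketch works on integers and shifts left, which differs from the fractional right-shift form only by a constant scale factor.

```python
def build_lut(coeffs):
    """Precompute every possible sum of coefficients.

    Entry addr holds the sum of coeffs[i] for each bit i set in addr.
    """
    count = len(coeffs)
    return [sum(coeffs[i] for i in range(count) if (addr >> i) & 1)
            for addr in range(1 << count)]

def da_sum_of_products(lut, samples, width):
    """Sum of products using only lookups, shifts, adds, and one subtract.

    samples are two's complement integers that fit in width bits.
    """
    acc = 0
    for b in range(width):                    # b = 0 is the least significant bit
        addr = 0
        for i, x in enumerate(samples):
            addr |= ((x >> b) & 1) << i       # bit b of each sample builds the address
        partial = lut[addr] << b              # weight the table entry by 2**b
        if b == width - 1:
            acc -= partial                    # the sign bit has negative weight
        else:
            acc += partial
    return acc

def fold(samples):
    """Add samples that share a coefficient in a symmetric filter."""
    n = len(samples)
    folded = [samples[i] + samples[n - 1 - i] for i in range(n // 2)]
    if n % 2:
        folded.append(samples[n // 2])        # the center tap has no partner
    return folded

# Hypothetical symmetric 7-tap filter and a window of 8-bit samples.
coeffs = [1, -3, 5, 7, 5, -3, 1]
window = [12, -7, 100, -128, 127, 3, -50]

direct = sum(c * x for c, x in zip(coeffs, window))
full = da_sum_of_products(build_lut(coeffs), window, 8)
# Folding halves the address width (4 bits, not 7) but needs one extra sample bit.
folded = da_sum_of_products(build_lut(coeffs[:4]), fold(window), 9)
print(direct, full, folded)
```

All three printed values should agree. The table grows as two to the power of the number of taps, so folding a 7-tap filter down to four addresses shrinks the table from 128 entries to 16. Longer filters are usually split across several smaller tables for the same reason.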
Specialized DSPs
Motorola and Crystal Semiconductor each have a series of DSPs tailored for audio. The two companies offer different cores, but both feature DSPs with the AC-3, Pro Logic, and DTS algorithms in masked ROM on board. For high-end transform processing, Sharp Semiconductor offers the Butterfly DSP, which is tailored to fast Fourier transform processing. Sharp also has DSP chipsets available.
Standard microcontrollers
I hope I’ve made it apparent that DSP technology is an abstraction and is not device-dependent. The choice of device is application-driven.
Don Morgan is senior engineer at Ultra Stereo Labs and a consultant with 25 years of experience in signal processing, embedded systems, hardware, and software. His most recent book is Numerical Methods for DSP Systems in C. He is also the author of Practical DSP Modeling, Techniques, and Programming in C.
