The voice-over-packet market continues to grow and evolve. From the residential gateway up to central office processing, the market is robust and open to a wide range of options. The market is also flooded with multiple standards for codecs, echo cancellation and overall voice channel processing. Channel density, the number of voice channels that a given solution can handle, is a critical criterion when evaluating a DSP solution for a product. However, there is significant confusion, often intentional, about real vs. theoretical channel density on a given DSP core. So, what are the real criteria and issues in determining channel density?
Voice-over-packet systems are built to process multiple channels of full-duplex digital voice that flow to and from a packet-based network. Each two-way voice channel in such a system involves several signal-processing blocks. The two most important are the voice codec and the echo canceler.
A wide variety of voice codecs is employed in voice-over-packet systems, all of which serve to reduce the amount of bandwidth necessary to convey the voice data. Not all voice codecs are created equal. They differ widely in complexity and in quality, which in a voice-over-packet system is defined as a combination of the sound quality and the delay the codec adds to the system. The International Telecommunication Union (ITU) standardizes many of the codecs used in voice-over-packet systems. Codecs can be grouped according to their relative data rates, complexity (in terms of processing power), delay budgets and sound quality.
Because round-trip delay is greater through a packet-based network than through the circuit-switched network, echo generated in the local loop must be kept from entering the packet network. Most voice-over-packet systems employ an implementation of the ITU G.168 standard for digital echo cancellation. The amount of processing power necessary to perform the echo cancellation is tied directly to the length of the tail in the echo canceler's adaptive filter. The length necessary is a function of the distance between the voice-over-packet system and the phone in the local loop.
For residential gateway applications, the tail length can be comparatively short (16 to 32 milliseconds, or 128 to 256 taps) because the local loop is confined to a residence or small office building. For other applications, such as interexchange-carrier hop-off service, the tail length must be comparatively longer (64 to 128 ms, or 512 to 1,024 taps), since the local loop spans a metropolitan area.
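The tap counts quoted above follow from the tail length once a sampling rate is fixed. The sketch below assumes the 8-kHz narrowband telephony rate, which is what makes 16 ms equal 128 taps. The function names, and the figure of two multiply-accumulates per tap per sample (one for the filter output, one for the coefficient update in an LMS-style filter), are illustrative assumptions rather than numbers from any particular echo canceler.

```python
SAMPLE_RATE_HZ = 8000  # assumption: narrowband telephony sampling rate


def tail_taps(tail_ms, sample_rate_hz=SAMPLE_RATE_HZ):
    """Number of adaptive-filter taps needed to span a tail of tail_ms."""
    return tail_ms * sample_rate_hz // 1000


def adaptive_filter_macs_per_second(tail_ms, macs_per_tap=2,
                                    sample_rate_hz=SAMPLE_RATE_HZ):
    """Rough MAC rate for the adaptive filter alone.

    macs_per_tap=2 is an assumption (filter output plus coefficient update);
    double-talk detection, nonlinear processing and comfort noise are not counted.
    """
    return tail_taps(tail_ms, sample_rate_hz) * macs_per_tap * sample_rate_hz


for ms in (16, 32, 64, 128):
    print(ms, "ms ->", tail_taps(ms), "taps,",
          adaptive_filter_macs_per_second(ms), "MACs/s")
```

Under these assumptions the filter cost grows linearly with tail length, so a 128-ms tail costs eight times as much per channel as a 16-ms tail.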
Echo cancellation involves more than just an adaptive filter. Other signal-processing functions, such as double-talk detection (when both parties in the call are speaking), nonlinear echo suppression and comfort-noise generation, are also necessary to have a complete and stable solution that performs well in the network.
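As a minimal sketch of how the adaptive filter and the double-talk detector interact, the code below uses a normalized LMS (NLMS) update and a Geigel-style level comparison. Both are common textbook choices assumed here for illustration; the article does not name a specific algorithm, and G.168 itself specifies performance requirements rather than an implementation. The function name, step size and threshold are hypothetical, and nonlinear echo suppression and comfort-noise generation are left out.

```python
def nlms_echo_cancel(far, mic, taps=16, mu=0.5, eps=1e-6, dtd_threshold=0.5):
    """Cancel the echo of `far` found in `mic`; return (residual, weights)."""
    w = [0.0] * taps   # adaptive filter coefficients (echo path estimate)
    x = [0.0] * taps   # far-end delay line, newest sample first
    residual = []
    for far_s, mic_s in zip(far, mic):
        x = [far_s] + x[:-1]
        echo_estimate = sum(wi * xi for wi, xi in zip(w, x))
        err = mic_s - echo_estimate
        # Geigel-style test: if the microphone is louder than any plausible
        # echo of the recent far-end signal, assume near-end speech is present.
        double_talk = abs(mic_s) > dtd_threshold * max(abs(v) for v in x)
        if not double_talk:
            # NLMS update, frozen during double talk so the filter
            # does not diverge while the near-end party is speaking.
            step = mu * err / (sum(xi * xi for xi in x) + eps)
            w = [wi + step * xi for wi, xi in zip(w, x)]
        residual.append(err)
    return residual, w
```

Freezing adaptation during double talk is the reason the detector is part of a stable solution: without it, near-end speech is treated as error and corrupts the echo path estimate.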
Beyond voice codecs and echo cancelers, a number of other signal-processing functions must be run in the two-way voice channel:
- Tone generation and detection. Voice-over-packet systems must continue to detect and generate the in-band signaling of the public switched telephone network, which uses dual-tone multifrequency (DTMF) and call-progress tones.
- Voice-activity detection. It is necessary to accurately assess the presence of voice on the channel to be able to manage network bandwidth. Many systems send the stream of packet voice only when voice is detected.
- Caller ID generation. Subscribers to voice-over-packet services expect to receive the same complement of calling features that they have with their circuit-switched service. That makes it necessary to provide such functions as caller ID to continue the level of expected service.
- Configuration and control. All of the features of the two-way voice channel need to run in a scalable multichannel framework. It is necessary to configure and control each of the signal-processing components that are running across all of the channels.
A key requirement of this framework is the ability to run different codecs and to independently report detection events and generate tones on independent channels.
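Tone detection is a good illustration of a function that must run independently on every channel. The sketch below uses the Goertzel algorithm, a widely used way to measure energy at a few known frequencies, applied to the standard DTMF row and column frequencies. It is a simplified illustration: the function names are hypothetical, and a production detector would add level thresholds, twist checks and timing validation that are omitted here.

```python
import math

DTMF_ROWS = (697, 770, 852, 941)       # Hz, standard DTMF low group
DTMF_COLS = (1209, 1336, 1477, 1633)   # Hz, standard DTMF high group
DTMF_KEYS = ("123A", "456B", "789C", "*0#D")


def goertzel_power(samples, freq_hz, sample_rate_hz=8000):
    """Energy of `samples` near freq_hz, computed with the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq_hz / sample_rate_hz)
    s1 = s2 = 0.0
    for x in samples:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def detect_dtmf(samples, sample_rate_hz=8000):
    """Return the key whose row and column tones carry the most energy.

    No thresholds are applied, so this always returns a key; it is only
    meaningful when a DTMF tone is known to be present.
    """
    row = max(DTMF_ROWS, key=lambda f: goertzel_power(samples, f, sample_rate_hz))
    col = max(DTMF_COLS, key=lambda f: goertzel_power(samples, f, sample_rate_hz))
    return DTMF_KEYS[DTMF_ROWS.index(row)][DTMF_COLS.index(col)]
```

Because the detector keeps no state between calls, a multichannel framework can invoke it on each channel's buffer and report events per channel without any cross-channel coupling.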
The bottlenecks and constraints on the design of a voice-over-packet system can be examined at two levels. The first is the DSP performance necessary to do the core signal processing of the two-way voice channel. The second is the system data throughput associated with moving the multiple channels of voice data. Many algorithms used in processing the two-way voice channel depend on the multiply-accumulate (MAC) operations prevalent in digital filtering. When looking at the suitability of a DSP core for such processing, it's critical to ask how many MAC operations the core can perform per cycle.
The answer will depend on two main factors. The first is obvious: How many MAC units are in the DSP? The second is subtler and often overlooked: Can the DSP read and write memory with sufficient bandwidth to keep all of the MAC units working in parallel?
The latter capability is critical. In determining the suitability of a DSP core for two-way voice channel processing, you will want to understand how many MAC operations can be sustained per cycle.
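The two factors can be combined into a simple bound. The sketch below assumes that each MAC consumes two operands from memory per cycle (a sample and a coefficient, as in an FIR filter); the function and parameter names are hypothetical, and a real core's pipeline and bus structure would need to be checked against its data sheet.

```python
def sustained_macs_per_cycle(mac_units, operand_reads_per_cycle,
                             operands_per_mac=2):
    """Upper bound on MACs per cycle once memory bandwidth is considered.

    Assumption: every MAC needs `operands_per_mac` fresh operands from memory.
    The result is the smaller of the arithmetic limit and the memory limit.
    """
    memory_limit = operand_reads_per_cycle / operands_per_mac
    return min(mac_units, memory_limit)


# A core with two MAC units but only two operand reads per cycle
# sustains one MAC per cycle, not two.
print(sustained_macs_per_cycle(mac_units=2, operand_reads_per_cycle=2))
```

Under this model, adding MAC units without adding memory ports does not raise the sustained rate, which is why the peak figure alone can mislead.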
Another critical feature of the DSP platform in this application is the efficient management of data buffer addressing. The DSP core should have low-overhead, hardware-assisted capabilities for circular and indexed addressing. This is especially crucial in producing an efficient multichannel framework.
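To show what circular addressing buys, the sketch below implements an FIR delay line in software with explicit modulo arithmetic. On a DSP with hardware circular addressing, the wrap-around in `push` and `fir` would cost no extra instructions; here it is written out so the bookkeeping is visible. The class and method names are illustrative only.

```python
class CircularDelayLine:
    """Fixed-length sample history stored in a circular buffer."""

    def __init__(self, length):
        self.buf = [0.0] * length
        self.head = 0  # index of the newest sample

    def push(self, sample):
        # Move the head backward with wrap-around and overwrite the oldest
        # sample; no data is shifted, unlike a linear delay line.
        self.head = (self.head - 1) % len(self.buf)
        self.buf[self.head] = sample

    def fir(self, coeffs):
        # coeffs[0] multiplies the newest sample, coeffs[1] the one before, etc.
        n = len(self.buf)
        return sum(c * self.buf[(self.head + i) % n]
                   for i, c in enumerate(coeffs))
```

In a multichannel framework each channel would own one such delay line per filter, so the cost of the address arithmetic is multiplied by the channel count.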
Beyond these performance metrics of the DSP core, there are critical constraints and bottlenecks to consider in moving the voice data in and out of the system. Because the DSP core is best suited to run the processing algorithms of the two-way voice channel, you would rather not have it also involved in moving the multichannel voice data to and from the packet and telephone networks. On the telephone network side of the system, it is effective to have a custom hardware block that can manage the shuttling of voice data between buffers in the DSP core memory space and a time-division multiplexed bus. On the packet network side of the system, it is effective to have the DSP core work in conjunction with a smart direct memory access controller that is capable of shuttling voice packets between the DSP core and the network.
If you import the application source code (or intellectual property) from the DSP vendor, scrutinize the performance data for the applications running on the DSP. Are the provided MIPS estimates averages or maximums? Are the digital echo canceler (DEC) performance numbers measured while the DEC is converging? Do the DEC MIPS estimates include the double-talk detector, nonlinear processor and comfort-noise generator? What are the instruction and data memory requirements per channel?
Another performance concern is how well the DSP handles overhead operations, such as control code and internal movement of data. That overhead drastically reduces the system channel density.
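The questions above translate into a back-of-the-envelope density estimate. The sketch below is one hedged way to combine them: it budgets with worst-case (not average) per-channel MIPS, subtracts a fraction of the core for overhead, and also checks the per-channel data memory limit. All parameter names and the example figures are hypothetical and do not describe any real device.

```python
import math


def channel_density(dsp_mips, worst_case_mips_per_channel, overhead_fraction,
                    data_mem_words, data_words_per_channel):
    """Estimate how many channels fit, limited by MIPS or by data memory."""
    usable_mips = dsp_mips * (1.0 - overhead_fraction)
    by_mips = math.floor(usable_mips / worst_case_mips_per_channel)
    by_memory = data_mem_words // data_words_per_channel
    return min(by_mips, by_memory)


# Hypothetical numbers: the theoretical figure ignores overhead,
# the realistic one reserves a quarter of the core for it.
theoretical = channel_density(400, 10, 0.0, 65536, 2048)
realistic = channel_density(400, 10, 0.25, 65536, 2048)
print(theoretical, realistic)
```

With these made-up inputs the theoretical case is limited by memory and the realistic case by MIPS, which illustrates why a single headline number says little without the assumptions behind it.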
The performance of a voice-over-packet system and the channel density that it can support are not just functions of a particular algorithm's performance on a DSP core. In fact, benchmarking a single algorithm from the two-way voice channel on a DSP core, without reference to the system solution, can be misleading. We have found that it is much more accurate to benchmark a voice-over-packet system using typical multichannel system scenarios as a gauge of performance.
Channel density in a voice-over-packet system is not a function of algorithm performance on a particular DSP core alone. At the system level, the movement of channel state information and the orchestration of multiple signal-processing functions in the two-way voice channel are key.
System-level benchmarking in a multichannel environment provides a more accurate view of performance and helps designers choose the best voice-over-packet solution.
See related chart