Noise and echo can have a major influence on the performance of communication systems. For a voice-activated telematics system, a recognition accuracy degraded to 90% by automotive noise renders it useless for dialing a 10-digit phone number. In a fire engine, poor communication with the control room due to the screaming of the siren can result in a slower or confused response, potentially resulting in tragedy. For a Formula 1 driver and the pit crew, engine noise could easily wipe out their chance of a podium finish if it caused a misunderstanding about strategy. Although most of us take it for granted, technology for handling noise and echo is essential for many of today’s communication systems.
Good System Design – An Important Start
Audio engineers know that many of the problems that cause noise and echo can be minimized at the initial design stage. A good example that illustrates some of these design considerations is a hands-free communication system in a car.
Figure 1: Omni-directional microphones offer a wide coverage area, but also pick up noise from all directions.
Figure 2: Substituting a directional microphone reduces noise while improving the SNR of speech from the driver.
Simply using a uni-directional microphone pointing towards the driver, rather than an omni-directional microphone, can eliminate a great deal of road noise, sound from the music system and passenger noise.
Figure 3: Placing the microphone too close to the speaker causes more noise and echo problems. Acoustic directionality of the speaker and microphone can also mimic physical placement problems.
Figure 4: Proper selection of speaker and microphone installation simplifies the noise control and cancellation problem.
The location of the microphone relative to the loudspeaker can also have a considerable impact on the performance of the hands-free communication system, as the microphone will pick up the sound from the loudspeaker to a greater or lesser extent depending on its position and orientation.
Developing a good analog filter for the signal input from the microphone and choosing a higher sampling rate for its conversion to the digital domain are important first steps in the electronic design. The analog filter contributes to reduced electrical interference and microphone buffeting. Higher sampling rates offer a broader bandwidth, capturing more of the voice signal frequencies and therefore a better quality of voice for the system to handle.
Figure 5: Typical bandwidth limiting for traditional telephone communications cuts off around 4kHz. This is consistent with legacy wireline telephony.
Figure 6: Extended bandwidth captures more of the speech signal improving fidelity and the communications experience.
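The link between sampling rate and captured bandwidth can be sketched with the Nyquist limit: a converter sampling at a given rate can represent frequencies only up to half that rate. The example below is a minimal illustration, not a description of any particular product. The helper names and the 8 kHz and 16 kHz rates are assumptions chosen for the example.

```python
def usable_bandwidth_hz(sample_rate_hz: float) -> float:
    """Highest frequency a sampled system can represent (the Nyquist limit)."""
    return sample_rate_hz / 2


def apparent_frequency_hz(tone_hz: float, sample_rate_hz: float) -> float:
    """Frequency at which a sampled tone shows up; tones above the limit fold back."""
    nearest_multiple = round(tone_hz / sample_rate_hz) * sample_rate_hz
    return abs(tone_hz - nearest_multiple)


# Compare an assumed narrowband rate (8 kHz) with an assumed wideband rate (16 kHz).
for rate in (8000, 16000):
    print(
        f"{rate} Hz sampling: bandwidth {usable_bandwidth_hz(rate):.0f} Hz, "
        f"a 5000 Hz tone appears at {apparent_frequency_hz(5000, rate):.0f} Hz"
    )
```

At 8 kHz sampling the 5 kHz tone folds back to 3 kHz and lands inside the voice band, which is one reason the analog input filter and the sampling rate need to be chosen together.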
Amplification and management of the signal strength is a critical component of ensuring quality; too big and the signal will be distorted through clipping, too small and it will be embedded in the noise of the system and will be difficult to extract.
Figure 7: Optimum signal amplification results in the full range of discrete values being made available for representing the signal.
Figure 8: Too little amplification decreases the SNR.
Figure 9: Over-amplification causes the signal to be clipped and in some cases may make the signal unintelligible.
A strong, clear, undistorted voice signal is an essential starting point for good voice quality. Once this is in place, noise reduction and echo cancellation can be applied for further enhancement.
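The trade-off between too little and too much amplification can be illustrated numerically. The sketch below assumes a hypothetical 8-bit converter and a 440 Hz test tone sampled at 48 kHz. It applies three gains, clips and quantizes the result, and compares each captured signal with the ideal one. The function names and values are illustrative assumptions only.

```python
import math


def quantize(sample: float, bits: int) -> float:
    """Round to the nearest code of a signed converter, clipping at full scale."""
    levels = 2 ** (bits - 1)
    code = round(sample * levels)
    code = max(-levels, min(levels - 1, code))  # clipping happens here
    return code / levels


def snr_db(gain: float, bits: int = 8, n: int = 4800) -> float:
    """SNR of the captured tone relative to the ideal amplified tone."""
    signal_power = 0.0
    error_power = 0.0
    for i in range(n):
        ideal = gain * math.sin(2 * math.pi * 440 * i / 48000)
        captured = quantize(ideal, bits)
        signal_power += ideal ** 2
        error_power += (captured - ideal) ** 2
    return 10 * math.log10(signal_power / error_power)


for label, gain in (("too little", 0.01), ("near full scale", 0.9), ("too much", 4.0)):
    print(f"{label:>15}: gain {gain:<4} -> SNR {snr_db(gain):.1f} dB")
```

In this toy setup the near-full-scale gain gives the best SNR: the low gain wastes most of the available codes, while the high gain is dominated by clipping error.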
The Drive Through – A Simple System
The communication system at your local Drive Through restaurant is a great example of a voice system that could employ noise reduction software. When you speak into the ‘microphone post’ to place your order, the system ideally removes background traffic noise, and even the noise from your own car, making order errors less likely and order taking faster.
Figure 10: Noise reduction in a drive through restaurant system poses few opportunities for feedback. Automotive noise reduction makes it easier for employees to understand orders.
You will see from the diagram above that the system contains a DSP (Digital Signal Processor). This processor provides the platform for the speech enhancement algorithms. Today this DSP will often be a Texas Instruments or Analog Devices device. As we move forward, it is equally likely to be a Bluetooth device or some new form of WiFi processor. In some instances, these complex and highly refined voice algorithms are integrated into dedicated silicon and sold to developers as a discrete component performing all of the functions illustrated in the diagram and more. Alternatively, in more complex systems, these algorithms may run as a task on operating systems such as OSE, Linux, Windows or QNX.
The Referenced Noise Filter
The Referenced Noise Filter is an example of a noise reduction technique that can target a specific type of problem where the noise-producing source is man-made and accessible. If we use the example of the fire engine in the introduction, the problem faced here is the constant background sound of the fire engine’s siren. When the engine driver talks into his microphone, it will be picking up his voice as well as the sound of the siren.
Figure 11: The referenced noise filter takes a signal (a direct tap) from the source of the noise (the siren) and uses it as a reference, enabling the filter to target the noise that needs to be removed. Running this algorithm on the DSP can be highly effective, achieving a 90% reduction in the unwanted sound.
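One common way to build a filter of this kind is an adaptive noise canceller driven by the LMS (least mean squares) algorithm. The article does not say which algorithm NCT uses, so the following is only a minimal sketch of the general idea under stated assumptions: the siren is a synthetic swept tone, the voice is a synthetic modulated tone, and the acoustic path from siren to microphone (two delayed copies) is hypothetical. The reduction it prints applies to this toy setup and is not a reproduction of the 90% figure.

```python
import math

FS = 8000          # assumed sample rate in Hz
N = 2 * FS         # two seconds of audio


def lms_cancel(mic, reference, taps=16, mu=0.01):
    """Subtract an adaptively filtered copy of the reference from the mic signal."""
    weights = [0.0] * taps
    history = [0.0] * taps                      # recent reference samples, newest first
    cleaned = []
    for mic_sample, ref_sample in zip(mic, reference):
        history = [ref_sample] + history[:-1]
        noise_estimate = sum(w * x for w, x in zip(weights, history))
        error = mic_sample - noise_estimate     # the error is also the cleaned output
        weights = [w + mu * error * x for w, x in zip(weights, history)]
        cleaned.append(error)
    return cleaned


def delayed(signal, n, delay):
    return signal[n - delay] if n >= delay else 0.0


def power(signal):
    return sum(s * s for s in signal) / len(signal)


# Direct tap from the siren: a tone sweeping between 600 Hz and 1200 Hz.
siren, phase = [], 0.0
for n in range(N):
    freq = 900 + 300 * math.sin(2 * math.pi * 0.5 * n / FS)
    phase += 2 * math.pi * freq / FS
    siren.append(math.sin(phase))

# Stand-in for the driver's voice: a 220 Hz tone with a slowly varying envelope.
voice = [
    0.3 * (0.5 + 0.5 * math.sin(2 * math.pi * 3 * n / FS))
    * math.sin(2 * math.pi * 220 * n / FS)
    for n in range(N)
]

# Hypothetical acoustic path from the siren to the microphone.
siren_at_mic = [0.8 * delayed(siren, n, 3) + 0.3 * delayed(siren, n, 7) for n in range(N)]
mic = [v + s for v, s in zip(voice, siren_at_mic)]

cleaned = lms_cancel(mic, siren)

# Measure the siren left in the output once the filter has had time to adapt.
half = N // 2
residual = [c - v for c, v in zip(cleaned[half:], voice[half:])]
reduction_db = 10 * math.log10(power(siren_at_mic[half:]) / power(residual))
print(f"Siren reduced by {reduction_db:.1f} dB in this synthetic example")
```

The key design point is that the filter never has to guess what the noise looks like: the direct tap supplies it, and the adaptation only has to learn how the cabin acoustics alter it on the way to the microphone.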
Voice Recognition Enhancer
Voice recognition is still very much an emerging technology. One of its key applications will be for the control of in-car telematic systems. Using your voice, you’ll be able to instruct your navigation system, dial up your favorite restaurant to book a table, maybe even open the sunroof and select your favorite music.
One of the biggest challenges for these systems has been overcoming noise. A voice recognition accuracy of 90% is close to useless: on average, one digit in a 10-digit phone number will be wrong. The difficulty is that as the car accelerates, wind and road noise start to degrade the performance of the system.
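The arithmetic behind the 90% figure can be checked with a short calculation. The sketch below assumes that each digit is recognized independently with the same accuracy, which is a simplification, and uses a hypothetical helper name.

```python
def whole_number_success(per_digit_accuracy: float, digits: int = 10) -> float:
    """Chance that every digit is recognized, assuming independent errors."""
    return per_digit_accuracy ** digits


for accuracy in (0.90, 0.95, 0.99):
    print(
        f"{accuracy:.0%} per digit -> "
        f"{whole_number_success(accuracy):.1%} chance of a correct 10-digit number"
    )
```

Under this assumption, 90% per-digit accuracy gives only about a 35% chance of dialing a 10-digit number correctly on the first attempt, while 99% gives about 90%.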
Figure 12: Spoken digit recognition rate improves sufficiently to become practical at speeds up to 70 mph when noise cancellation is employed.
The x-axis of the graph shows that as the car accelerates from 0 mph to more than 70 mph, the ‘Speech to Noise Ratio’ declines. In turn, the hit rate (voice recognition accuracy) also declines. Running a voice recognition enhancer (VRE) software algorithm on the DSP can provide valuable improvements in the hit rate at higher speeds. It is easy to see that a 10% improvement can be the difference between a successful voice instruction and a failed one. A VRE works by analyzing the sampled input data and classifying it: what it judges to be speech it leaves alone (crucial for successful speech recognition), and what it judges to be noise it reduces in amplitude.
It can differentiate successfully between speech and noise because speech varies rapidly in amplitude and pitch, whereas noise varies much more slowly (to the point of being what audio engineers term 'stationary' noise).
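The idea of separating rapidly varying speech from slowly varying noise can be sketched with a very simple time-domain scheme. This is not the VRE algorithm itself, whose details the article does not give. It is a crude stand-in that tracks a slowly rising noise floor per frame, passes frames well above the floor unchanged, and attenuates the rest. All names, thresholds and the synthetic test signal are assumptions, and the recording is assumed to start with noise only.

```python
import math
import random

FRAME = 160  # 20 ms frames at an assumed 8 kHz sample rate


def enhance(samples, frame=FRAME, floor_rise=1.02, speech_margin=3.0, noise_gain=0.25):
    """Leave speech frames untouched and attenuate frames judged to be noise."""
    noise_floor = None
    output, decisions = [], []
    for start in range(0, len(samples), frame):
        block = samples[start:start + frame]
        energy = sum(s * s for s in block) / len(block)
        if noise_floor is None:
            noise_floor = energy                  # assumes the first frame is noise
        else:
            # The floor drops at once but rises slowly, so bursts do not drag it up.
            noise_floor = min(energy, noise_floor * floor_rise)
        is_speech = energy > speech_margin * noise_floor
        gain = 1.0 if is_speech else noise_gain
        output.extend(gain * s for s in block)
        decisions.append(is_speech)
    return output, decisions


# Synthetic input: steady background noise with a louder burst in frames 20 to 39.
rng = random.Random(0)
samples = []
for n in range(60 * FRAME):
    value = rng.gauss(0.0, 0.05)
    if 20 * FRAME <= n < 40 * FRAME:
        value += 0.5 * math.sin(2 * math.pi * 200 * n / 8000)
    samples.append(value)

output, decisions = enhance(samples)
print("".join("S" if d else "." for d in decisions))
```

Practical enhancers typically make this decision separately in many frequency bands rather than on whole frames, but the principle of leaving speech alone and turning down what stays constant is the same.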
Hands-Free Systems – A New Set of Issues
For in-vehicle communication systems, hands-free systems are essential for safety and in order to comply with driving legislation. However, a hands-free system can be complex to design effectively. As with the voice recognition system, there is concern about road, wind and engine noise but in addition to this, a hands-free system will generate echo for the caller.
The problem is that the caller's voice will be emitted by the loudspeaker and will travel around the vehicle before being picked up by the microphone. The caller will hear an echo of their own voice, and this will constantly interfere with the overall communication.
Figure 13: Two-way communications such as mobile phones employ adaptive filters to remove noise while adding an echo canceller to deal with near-end/far-end echo.
Indeed this problem can exist even when using a mobile handset or a Bluetooth headset – there will be acoustic echo generated by direct coupling between the loudspeaker and the microphone to a greater or lesser degree. To make hands-free systems viable, developers use a combination of a noise reduction algorithm along with an echo cancellation algorithm running on the DSP device. Often, this is also bundled with some speech enhancing software that provides the services of some acoustic filters and gain control to boost the clarity of the voice.
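A widely used approach to acoustic echo cancellation is an adaptive filter that learns the path from loudspeaker to microphone and subtracts the predicted echo before the signal is sent back to the caller. The sketch below uses the normalized LMS (NLMS) algorithm as one possible choice. The article does not state which algorithm is used in practice. The far-end signal is random noise standing in for the caller's speech, the echo path is hypothetical, and the case where both parties talk at once is not handled.

```python
import math
import random


def cancel_echo(mic, far_end, taps=32, mu=0.5, eps=1e-8):
    """Remove the loudspeaker echo from the mic signal with an NLMS adaptive filter."""
    weights = [0.0] * taps
    history = [0.0] * taps                      # recent far-end samples, newest first
    sent = []
    for mic_sample, far_sample in zip(mic, far_end):
        history = [far_sample] + history[:-1]
        echo_estimate = sum(w * x for w, x in zip(weights, history))
        residual = mic_sample - echo_estimate   # what is sent back to the caller
        scale = mu * residual / (sum(x * x for x in history) + eps)
        weights = [w + scale * x for w, x in zip(weights, history)]
        sent.append(residual)
    return sent


def power(signal):
    return sum(s * s for s in signal) / len(signal)


rng = random.Random(1)
N = 8000
far_end = [rng.gauss(0.0, 0.3) for _ in range(N)]       # stand-in for the caller's voice
cabin_noise = [rng.gauss(0.0, 0.01) for _ in range(N)]  # low-level near-end noise

# Hypothetical echo path: delay in samples mapped to gain.
ECHO_PATH = {5: 0.5, 12: -0.3, 20: 0.1}
echo = [
    sum(gain * far_end[n - delay] for delay, gain in ECHO_PATH.items() if n >= delay)
    for n in range(N)
]
mic = [e + c for e, c in zip(echo, cabin_noise)]

sent = cancel_echo(mic, far_end)

# Echo return loss enhancement (ERLE), measured after the filter has adapted.
half = N // 2
residual_echo = [s - c for s, c in zip(sent[half:], cabin_noise[half:])]
erle_db = 10 * math.log10(power(echo[half:]) / power(residual_echo))
print(f"Echo reduced by {erle_db:.1f} dB in this synthetic example")
```

Normalizing the step size by the recent far-end energy keeps the adaptation stable whether the caller is speaking loudly or softly, which is why NLMS is often preferred over plain LMS for speech signals.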
How Can You Benefit from this Technology?
Noise cancellation technology is readily available as off-the-shelf software components. It is developed by expert audio and software engineers and proven in millions of devices. Developers choose to buy in these algorithms to ensure that their products have the very best voice quality. Many of these algorithms are optimized for small-footprint devices and can also be tailored for specific applications such as Bluetooth headsets, 3G mobile phones, TETRA radios and Formula 1 racing.
A range of evaluation hardware is readily available, enabling engineers to assess the impact this technology would have on their next design and to enable them to become familiar with the issues surrounding noise cancellation technology.
When you receive the next call on your mobile phone, just listen to how clearly you can hear your caller. NCT alone has invested over 70 man-years of mathematical, acoustic and software development work to deliver the software that provides this level of clear speech. Noise reduction and echo cancellation have been a big subject for the communications sector over the last 10 years. It is expected that as voice recognition and Bluetooth systems roll out, the next 10 years will be equally busy for voice enhancement software companies.
NCT (Europe) Ltd