Speech recognition cuts driver distraction
Rick DeMeis
8/5/2011 10:57 AM EDT
Since first appearing in automobiles in the late 1990s, voice recognition user interfaces have grown in the breadth of their applications and in the quality of their performance. Realization of this utility fortunately comes at a time when drivers and passengers are bringing more and more personal electronic and connectivity devices into the car—with their potential for greater driver distraction.
In addition, burgeoning growth in vehicle systems and functions presents increased potential for diverting driver concentration if feature interaction with the operator is not well planned.
But developers are unanimous that easy-to-use, intuitive speech recognition greatly reduces possible distractions by allowing drivers to keep their eyes (and attention) on the road and their hands on the steering wheel.
The challenges
The automobile environment, however, is not very conducive to voice recognition, notes Brian Radloff, director of worldwide embedded solution architecture in the Mobile Speech Division of Nuance Communications, a leading speech-recognition software provider. The car is inherently noisy, and having to use far-field microphones (at least in most applications) compounds the task.
He notes that when voice systems first appeared, the "recognition rate" was not always acceptable (see page 3). But by the early 2000s, improvements in microphones, audio technology, and echo and noise cancellation/subtraction algorithms moved recognition "up where it needed to be." Other improvements included automotive noise models built into the voice recognition module, as well as modeling (based on actual voice data from "users") of the basic "sound" units of speech (phonemes) in an auto noise environment. "These [techniques] still form the core of our automotive offering today," Radloff adds.
As more processing capability became available (see page 2), more complex domains (features) were enabled, he says. For example, one improvement was being able to "dial" a phone by saying the "name" of the target (e.g., "Call home" or "Call Jack"), as well as by speaking the phone number's digits. Likewise, voice navigation went from just zooming in and out on a map, to inputting addresses step by step (number, then street, then town), to letting users simply say the address naturally in one phrase.
Today's speech recognition systems "can build vocabulary dynamically from the phone book in a user's phone brought into the car," Radloff notes. In other words, the system "reads" the phone book listings automatically and adds those to its "grammar" (vocabulary) in order to recognize them when the user speaks to dial a listed phone by name. Similarly, music players can be read to harvest song titles for later request by voice.
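The dynamic vocabulary idea Radloff describes can be illustrated with a minimal sketch. It is not Nuance's implementation: the function names, the sample phone book, and the "call <name>" phrase pattern are all hypothetical, and the sketch assumes the speech has already been transcribed to text, whereas a production recognizer would compile the names into a phonetic grammar.

```python
import re

def normalize(text):
    """Lowercase text and strip punctuation so names and utterances compare equal."""
    return re.sub(r"[^a-z0-9 ]+", "", text.lower()).strip()

def build_dial_grammar(contacts):
    """Map each speakable phrase ("call <name>") to the number it should dial."""
    grammar = {}
    for name, number in contacts.items():
        spoken = normalize(name)
        if spoken:  # skip entries with no speakable characters
            grammar["call " + spoken] = number
    return grammar

def recognize(utterance, grammar):
    """Return the number for an in-grammar phrase, or None if it is not listed."""
    return grammar.get(normalize(utterance))

# Hypothetical phone book "read" from a handset brought into the car.
phone_book = {"Home": "555-0100", "Jack": "555-0101", "Dr. O'Neil": "555-0102"}
grammar = build_dial_grammar(phone_book)
```

With this grammar, recognize("Call Jack", grammar) looks up Jack's number, while a name that is not in the phone book falls outside the grammar and returns None. Rebuilding the dictionary whenever a different phone is connected is what makes the vocabulary "dynamic" in this sketch.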
The latest MyFord Touch™ connectivity system with Nuance software uses both voice and touchscreen interfaces, incorporating many of the functionalities noted above. "And we are expanding the vocabulary with different words, including synonyms [to be even more intuitive]," says Radloff. An example would be responding to the words "I'm hungry" by composing a list of nearby restaurants. "The system combines processing power available, for [running] more robust algorithms, and memory," he adds. And there's the rub.
This head unit display of a Ford SYNC voice texting screen visually depicts a few menu options.
agk
8/6/2011 7:04 AM EDT
This is a nice system. The driver can keep hands on the steering wheel and eyes on the road while operating all the gadgets by voice command. The driver can control the entertainment gadgets, navigation, and climate control, get vehicle information, and hold telephone conversations, all by voice.
cdhmanning
8/10/2011 1:39 AM EDT
It is not just keeping your hands on the wheel and eyes on the road that is important. You need to keep your brain engaged too.
If you are swearing at the machine because it is not hearing you properly then you are probably not in the right mood for driving!
It has been noted with Bluetooth headsets (which are legal in many areas) that these are as distracting as holding a cellphone to your ear during a voice call.
I must say I have not used voice recognition for a long time, but anyone that remembers the Microsoft VR fiasco will remember how bad it can be!
I once worked for a telecom company where we tested some VR gear that had been trained to understand British voices. None of the British guys in the office could make it work. Nor me (South African accent). The only guy who could make it work was a Fijian Indian guy!
rajuchaluva
8/25/2011 2:54 AM EDT
That's true!
Great work
prabhakar_deosthali
8/8/2011 2:59 AM EDT
Sure! Voice recognition UIs can definitely reduce the distractions for the driver. But some of those voice conversations themselves may be too distracting, for example if the driver happens to get a nagging call from his wife! There will be situations when the driver's eyes are on the road and his hands are on the steering wheel, but his mind is getting dragged into some other world, and the driver in everybody knows well what disasters can happen when your mind is not on the job at hand.
hm
8/8/2011 5:47 AM EDT
Very good progress, but it has a long way to go. Soon CMOS cameras and advanced image processing will add to the new UI.
Rick DeMeis
8/16/2011 1:58 PM EDT
I had the chance recently to interact with the SYNC system in a Ford Edge. It was easy to get used to, responding to most of the intuitive commands I could think up to control the audio and the climate control. Unfortunately, this car did not have a navigation system, which is where I think speech recognition can prove most useful.
I hope to use some other Nuance-based voice recognition products in the near future, and will report on the results.
selinz
8/24/2011 2:47 PM EDT
I like the system and guess what, my phone has the same function. I just hope we can keep the legislators from limiting the usability.
VUI Guy
6/21/2012 4:47 AM EDT
The CPU hit factor is not something to be taken lightly, which is why companies turn to a vendor like Rubidium, which offers a small-footprint, low-resource, cost-effective solution as opposed to the larger Nuance product.