MILPITAS, Calif. -- Using Bell Labs neural-based designs adapted to run over a local-area network, Lucent Technologies has demonstrated its first speech-recognition servers. Lucent's Voice Director is loaded onto a user's NT workstation, then accesses full voice-recognition services over a network. Integrated with Lucent's voice-messaging system and compatible with multivendor environments, Voice Director responds to commands spoken into a telephone.
With the system, a caller can be connected by speaking a name instead of an extension number, can dial hands-free or can ask voice-service applications to read back information from the network. "Lucent's Voice Director is the first speech-recognition product to be fully integrated with an enterprise voice-messaging server," said general manager Mike Goldgof. Integration with Lucent's Intuity voice-messaging hardware gives voice-recognition application developers direct access to voice, e-mail and fax messages.
A user calls into a single mailbox and has it report the number of voice, e-mail, fax and saved messages available. The caller can then listen to e-mail using text-to-speech conversion and respond with voice or with voice annotations. Voice Director permits voice commands to be substituted for many of the previous touch-tone functions.
"It's a much more natural way of working. You can call in for your mail from your cell phone while driving and send the same voice message to several different co-workers without ever touching the keypad," said product manager Blake Baxter.
Bell Labs trained a neural network to understand a 20,000-name vocabulary for any North American speaker. A single Windows NT computer running Voice Director can make the vocabulary available simultaneously to 48 channels of users. And since the portion of a voice interaction in which recognition is actually being done is very small (about 10 seconds for a one-minute interaction, according to Lucent), the NT server can handle as many as six times more users than its channel maximum.
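The capacity claim is simple arithmetic, and a minimal sketch makes it concrete. The function name and parameters below are hypothetical, not part of any Lucent software. The calculation reads "six times" as a multiplier on the 48 channels and assumes recognition demand is spread evenly across callers, which real traffic would only approximate.

```python
def max_concurrent_users(channels: int,
                         recognition_seconds: int,
                         interaction_seconds: int) -> int:
    """Estimate how many callers can share a pool of recognition channels.

    Each caller is assumed to occupy a channel only while recognition is
    running, and demand is assumed to be evenly spread (an idealization).
    """
    if channels < 0:
        raise ValueError("channels must not be negative")
    if recognition_seconds <= 0 or interaction_seconds <= 0:
        raise ValueError("durations must be positive")
    if recognition_seconds > interaction_seconds:
        raise ValueError("recognition time cannot exceed the interaction time")
    # Integer arithmetic avoids floating-point rounding in the ratio.
    return channels * interaction_seconds // recognition_seconds


# Figures quoted in the article: 48 channels, about 10 seconds of
# recognition in a one-minute interaction.
print(max_concurrent_users(48, 10, 60))  # 288
```

Under those assumptions the quoted figures work out to 288 simultaneous callers, six times the 48-channel maximum.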
Voice Director provides name-addressing and name-dialing functions in real time. Name-dialing lets users ask for a person by name, rather than having to speak an extension number. Name-addressing, on the other hand, lets users address messages, transfer to another extension or create lists of names by speaking each recipient's name. Voice Director verifies that it has the correct names by repeating each one back in the recipient's own voice.
"You can speak a whole list of names you want to send the same message to, and Voice Director will repeat each one in the recipient's own voice, so you can verify their correctness beforehand," said Goldgof. Besides the neural-based speech-recognition technology, Lucent also leaned heavily on Bell Labs for two other technologies fundamental to the speaker-independence of the Voice Director: noise filtering and echo cancellation.
Since the Voice Director will field calls from speaker phones, cell phones, noisy office environments and anywhere else someone is likely to carry a phone, noise filtering and echo cancellation were critical. "We spent a lot of time getting the acoustic model right," said Baxter. "Echo cancellation, for one, was crucial, since many environments, like speaker phones, can't avoid echo and we wanted it to work in all environments."
In operation, a user is prompted to speak a name when addressing a message or transferring a call, and the system verifies its correctness by repeating it back in the voice of the recipient. If multiple recipients are to get a message, the user merely speaks several names in sequence. When a user dials into the system from outside, the system prompts the user to speak the name of the desired party, and likewise confirms in the voice of the person called before connecting the call. In the event there are multiple matches, the system will step through the possibilities, giving the caller a chance to choose.
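The confirm-and-choose flow described above can be sketched in a few lines. This is an illustration only, not Lucent's implementation: the `DirectoryEntry` type, the `choose_party` function and the `confirm` callback are all hypothetical, the directory is an in-memory list, and the idea that confirmation uses a stored recording of each person's name is an assumption. Speech recognition itself is out of scope, so the recognized name arrives as plain text.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class DirectoryEntry:
    name: str
    extension: str
    name_audio: str  # assumed: reference to the person's own recorded name


def find_matches(directory: List[DirectoryEntry],
                 recognized_name: str) -> List[DirectoryEntry]:
    """Return every directory entry whose name matches the recognized text."""
    key = recognized_name.strip().lower()
    return [entry for entry in directory if entry.name.lower() == key]


def choose_party(directory: List[DirectoryEntry],
                 recognized_name: str,
                 confirm: Callable[[DirectoryEntry], bool]
                 ) -> Optional[DirectoryEntry]:
    """Step through the matches until the caller accepts one.

    `confirm` stands in for playing the candidate's name back to the
    caller and collecting a yes/no answer.
    """
    for candidate in find_matches(directory, recognized_name):
        if confirm(candidate):
            return candidate
    return None  # no match, or the caller rejected every candidate


directory = [
    DirectoryEntry("Pat Lee", "4101", "pat_lee_sales.wav"),
    DirectoryEntry("Pat Lee", "5230", "pat_lee_support.wav"),
    DirectoryEntry("Sam Ortiz", "4177", "sam_ortiz.wav"),
]

# Simulated caller who rejects the first Pat Lee and accepts the second.
chosen = choose_party(directory, "pat lee", lambda e: e.extension == "5230")
print(chosen.extension if chosen else "no party selected")
```

Passing the confirmation step in as a callback keeps the selection logic separate from however the prompt is actually played and answered.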
The Voice Director NT workstation connects to the host Intuity Audix message server over a local-area network (LAN). Names will be recognized not only from the connected Intuity Audix, but also from any other voice-message system in communication with the Intuity Audix. Any other LAN services present, such as the Internet, can be accessed from Voice Director. Lucent already has internal applications that read stock quotes from the Web or that use voice to coordinate schedules among members of the same work group, but anyone can use Lucent's Octel Designer to craft new voice-based applications.
Lucent claims that Voice Director is easier for users because they don't have to remember extension numbers, and easier for administrators because it offloads work from human operators by using speech recognition to look up names instantly.