NEW YORK - Looking to remove a barrier to bringing speech technology to the mainstream, Sun Microsystems Inc. this week announced the availability of the Java Speech API, a platform-independent standard for speech technology.
The Java Speech API specifies a single interface for the development and deployment of speech technology on desktops, on portable devices and in telephony servers. It extends the use of the Java programming language, and hence of Java applets and applications, into speech recognition and speech synthesis.
Sun had help developing the Java Speech API in an open standards process that included AT&T, Dragon Systems, IBM, Novell, Philips Speech Processing, Texas Instruments and 12 other companies.
The Java Speech API represents Sun's "strong conscious effort to move speech from a specialist technology to the mainstream," said Andrew Hunt, principal investigator for the Speech Applications Group at Sun Microsystems Laboratories, and the leader of Sun's Java Speech API effort.
While speech technology is finding its way into a widening scope of applications, a lack of standard tools and reference platforms has created a bottleneck to application development. The Java Speech API, which can be used with PersonalJava and EmbeddedJava platforms, is expected to remove that bottleneck and to promote the creation of various speech applications.
Separate from the Java Speech API, Sun has created a Java Speech Markup Language (JSML) to provide cross-platform control of speech synthesizers. The language annotates text input to the Java Speech API to tell a system's speech synthesizer how to present the text data as speech. While Sun is continuing its work with JSML, the language hasn't been updated since August 1997, Hunt said.
Sun's goal with JSML is similar to Motorola's goal with the VoxML markup language, Hunt said. Each is a proposed standard language that allows developers to add speech interfaces to Web applications or content.
The ability to use speech on the Internet will help bring about the use of speech technology for IP telephony, e-mail, Internet chat, and unified messaging, Hunt said. And speech will one day be used to provide secure access to information services, client-server speech services, and speech-driven Internet for handheld devices, he said.
Hunt foresees the use of voice in speaking Web pages and in browsing, which would improve access to the Web for disabled individuals and could provide access to Web pages from phones, pagers and other devices.
Information about the Java Speech API, the grammar format and JSML is available online.
As Sun released the Java Speech API, IBM and Lernout & Hauspie said they were making available their implementations of the Java Speech API 1.0 spec.
IBM's Speech for Java 1.0 offers access to voice command recognition, dictation and text-to-speech functionality in IBM's ViaVoice technology.
Availability of Sun's Java Speech API is a "major industry milestone" that will help make speech recognition a pervasive technology for desktop, mobile and enterprise solutions, said W.S. Osborne, general manager of IBM Speech Systems Business Unit.
IBM's implementation of the Java Speech API 1.0, which runs on Windows 95 and NT, can be downloaded free from IBM's AlphaWorks Web site. IBM also intends to ship a toolkit for the Java Speech API 1.0 in the first quarter of 1999.
IBM is also offering reference platforms for Windows and ActiveX. Information about IBM's development tools can be found online.
Lernout & Hauspie's implementation of Java Speech API sits on top of its True Voice technology, which was acquired earlier this year from Centigram. The company is working on an implementation of the API for its ASR 1600 recognition product.
Several other companies will soon announce implementations of the Java Speech API, Hunt said. Lotus Development is working on a version of Lotus Suite that will implement the API, for instance.
Aside from Sun's announcements, Dragon Systems Inc. (Newton, Mass.) announced its support for Microsoft's SAPI 4.0 application programming interface, which supports speech recognition and text-to-speech technology in Windows-based applications. Dragon will support Microsoft's SAPI 4.0 in version 3.5 of its NaturallySpeaking DeveloperSuite, which will be available in November.
Microsoft, which owns a minority stake in Lernout & Hauspie, has in the past discussed making its operating systems completely speech-enabled. But Eric Bidstrup, senior program manager for Microsoft's Intelligent Interface Technology Group, said the company has not set a time frame for making speech technology a part of its operating systems.
Bidstrup said that speech technology needs to advance and that hardware hurdles, such as poor microphones and sound card drivers, need to be overcome before users can talk and have meaningful conversations with their computers. Speech abilities in an operating system will by no means eliminate the need for a keyboard or mouse, he said.