LONDON — Two years ago, Gartner predicted that 30 percent of our interactions with technology in 2018 would happen through conversations with voice-based systems. Last month, an analyst predicted that Amazon’s Alexa will drive $10 billion in sales by 2020.
Voice recognition was a popular topic at last year's Consumer Electronics Show (CES), with many commentators musing that 2017 would be the year of voice recognition. According to market research firm Gartner, conversational platforms, including voice recognition, will be one of the top 10 strategic technology trends for 2018. They are expected to drive a paradigm shift in which systems become increasingly capable of answering simple questions (such as "how’s the weather?") and of handling more complicated interactions. A primary differentiator among conversational platforms will be the robustness of their conversational models and of the API and event models used to access, invoke and orchestrate third-party services to deliver complex outcomes.
Voice recognition technology is evolving to meet this demand, and investors see the opportunity. XMOS and Kami, both with operations in the UK, have raised funding in recent months. Bristol, UK-based XMOS raised $15 million in September from Infineon Technologies, Amadeus Capital Partners, Draper Esprit, Foundation Capital and Robert Bosch Venture Capital.
Kami, based in London and Hong Kong, raised $1.7 million in seed funding last month from Arm Innovation Ecosystem Accelerator Ltd (a subsidiary of Softbank), X Technology Fund (HKX) and Tin Fu Fund.
Earlier this year, XMOS launched its first far-field voice processor family, the XVF3000, and associated development kit. The company claims it is the only supplier in the world to have achieved Amazon AVS qualification for a far-field linear mic array development kit that enables easy integration of Amazon's Alexa into smart panels, kitchen appliances and other commercial and industrial electronics.
The XMOS VocalFusion XVF3500 voice processor, to be shown at CES.
At CES 2018, XMOS plans to showcase its new voice processor, which supports stereo AEC (acoustic echo cancellation), along with a stereo-AEC far-field linear microphone array solution. The XVF3500 voice processor, which is also supported by a development kit, delivers two-channel full-duplex acoustic echo cancellation. It is designed for developers working in the growing voice-enabled smart TV, soundbar, set-top box and digital media adapter markets, all of which require stereo-AEC support for "across the room" voice-interface solutions. The solution also supports configurable AEC latency: the AEC reference signals can be accurately calibrated and the latency adjusted, which enables after-market far-field voice accessories for existing consumer electronics products.
According to XMOS, commands are accurately captured from across the room for processing by a cloud-based speech recognition system, even in complex acoustic environments. The XVF3500 voice processor delivers voice digital signal processing (DSP) that includes a full-duplex acoustic echo canceller with barge-in capability, which lets users interrupt or pause a device that is playing music, and an adaptive beamformer that follows a speaker. Dereverberation, automatic gain control and noise suppression are intended to keep voice interaction clear even in noisy environments.
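For readers unfamiliar with acoustic echo cancellation, the idea can be illustrated with a minimal sketch. It assumes a single-channel normalized least-mean-squares (NLMS) adaptive filter, a common textbook approach. It is not XMOS's implementation, which is not described here, and it models neither stereo references nor barge-in. The function names (`nlms_echo_cancel`, `simulate_echo`, `power`), the tap count, the step size and the simulated echo path are all hypothetical choices made for illustration.

```python
import math
import random


def simulate_echo(ref, path):
    """Hypothetical room model: the mic hears the loudspeaker signal
    filtered by a short echo path (no speech, no noise)."""
    mic = []
    for n in range(len(ref)):
        acc = 0.0
        for k, h in enumerate(path):
            if n - k >= 0:
                acc += h * ref[n - k]
        mic.append(acc)
    return mic


def nlms_echo_cancel(mic, ref, taps=8, mu=0.5, eps=1e-8):
    """Subtract an adaptive estimate of the loudspeaker echo from the mic.

    mic: microphone samples (echo plus anything else in the room)
    ref: loudspeaker reference samples, time-aligned with mic
    Returns the echo-cancelled samples and the learned filter weights.
    """
    w = [0.0] * taps   # adaptive estimate of the echo path
    x = [0.0] * taps   # most recent reference samples, newest first
    out = []
    for d, r in zip(mic, ref):
        x = [r] + x[:-1]
        y = sum(wi * xi for wi, xi in zip(w, x))   # predicted echo
        e = d - y                                  # residual after cancellation
        norm = sum(xi * xi for xi in x) + eps      # reference energy (normalizer)
        g = mu * e / norm
        w = [wi + g * xi for wi, xi in zip(w, x)]  # NLMS weight update
        out.append(e)
    return out, w


def power(sig):
    """Mean squared value of a list of samples."""
    return sum(s * s for s in sig) / len(sig)


if __name__ == "__main__":
    random.seed(0)
    ref = [random.uniform(-1.0, 1.0) for _ in range(2000)]  # stand-in for music
    mic = simulate_echo(ref, [0.6, 0.3, -0.2, 0.1])         # assumed echo path
    out, w = nlms_echo_cancel(mic, ref)
    # Compare echo power before and after cancellation on the last 500 samples.
    ratio = power(mic[-500:]) / (power(out[-500:]) + 1e-30)
    print("echo attenuation (dB):", round(10 * math.log10(ratio), 1))
```

The filter learns how the loudspeaker signal reaches the microphone and subtracts that prediction, leaving whatever else the microphone picked up, such as a spoken command. The sketch also assumes the reference is already time-aligned with the microphone signal. In a real product that alignment has to be calibrated, which is the role of the configurable AEC latency described above.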