How Amazon’s Alexa voice assistant is changing its tone during the pandemic


Manoj Sindhwani is vice president of Alexa Speech at Amazon. (AWS Video)

Alexa, do I need a coronavirus test?
That’s a query that almost certainly was not in the repertoire for Amazon’s voice assistant six months ago. But the ins and outs of the coronavirus outbreak are changing Alexa’s work habits, said Manoj Sindhwani, Amazon’s vice president of Alexa Speech.
“We’re certainly seeing certain shifts,” Sindhwani told GeekWire this week. “You see a lot of people asking about COVID-19. People are not asking about ‘How long will it take for me to get to work?’”
There are also shifts in usage patterns because more people are working from home. The typical before-work and after-work peaks are being stretched out into the rest of the day. “It follows more of a weekend pattern in some ways,” Sindhwani said.
Some of the features that have been added to Alexa over the past few months are now helping users cope.
Sindhwani pointed to a twist that takes advantage of deep neural networks to make Alexa’s speech sound more natural when it’s reading the news, a Wikipedia article or other long passages of text-to-speech (TTS) output. That comes in handy when users are catching up on developments in the coronavirus crisis, or when they’re having Alexa read their kids a story.
“When we think of long-form content, and how we make that content more natural-sounding, for us it is about speaking style,” Sindhwani said. “There are separate teams that are working on what content is most relevant. That’s something that my team doesn’t focus on… But what I can tell you is, a lot of our focus in TTS has been the improvement of naturalness of long-form content.”
The ears and voice of Alexa
Sindhwani says his team is all about the “ears of Alexa and the voice of Alexa” — that is, how Alexa-enabled devices make out what users are trying to say, and how those devices deliver Alexa’s cloud-based content more clearly and naturally.
Such issues were the focus of last week’s International Conference on Acoustics, Speech and Signal Processing, which had initially been planned as an in-person conference in Barcelona but was turned into a virtual event due to the pandemic.
“One of the [lines of] research that we’re very proud of is data-efficient learning,” Sindhwani said. “Data-efficient learning is really more about how you create a lot of data, but do not start with a lot of data on day one.”
For example, take the issue of how Alexa recognizes the wake word to start listening to your queries. Under ideal circumstances, your device would hear a crystal-clear “Alexa” (or “Computer,” or “Echo”) from close up in a quiet room. But circumstances are not always ideal, as anyone who’s shouted at their Echo knows.

Sindhwani’s team has been training Alexa’s speech-recognition model to make out the wake word under more challenging conditions by...
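The article doesn’t spell out how that training works, but a common technique for data-efficient learning in speech is to augment a small set of clean recordings with synthetic noise at varying signal-to-noise ratios, so one clean utterance yields many “challenging” training variants. Here is a minimal sketch in Python; the function name, parameters and synthetic signals are illustrative assumptions, not Amazon’s actual pipeline:

```python
import numpy as np

def add_noise(clean: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    """Mix noise into a clean waveform at a target signal-to-noise ratio (dB)."""
    # Tile or trim the noise so it matches the clean signal's length.
    reps = int(np.ceil(len(clean) / len(noise)))
    noise = np.tile(noise, reps)[: len(clean)]
    # Scale the noise so the mixture hits the requested SNR.
    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2)
    target_noise_power = clean_power / (10 ** (snr_db / 10))
    noise = noise * np.sqrt(target_noise_power / noise_power)
    return clean + noise

# Example: one synthetic "utterance" plus white noise at several SNRs,
# producing four noisy training variants from a single clean recording.
rng = np.random.default_rng(0)
clean = np.sin(2 * np.pi * 440 * np.linspace(0, 1, 16000))  # 1 s tone at 16 kHz
noise = rng.standard_normal(16000)
augmented = [add_noise(clean, noise, snr) for snr in (20, 10, 5, 0)]
```

A wake-word model trained on such mixtures sees the same word under many acoustic conditions without anyone having to record it in a noisy room.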
