Our vision for Alexa is to not only be useful, but to markedly improve the lives of millions of people worldwide. The Alexa Trust and Alexa AI teams work every day to make this vision a reality. We sat down with Anne Toth, Manoj Sindhwani, and Prem Natarajan to discuss how Amazon protects customer privacy while using data responsibly to improve the Alexa experience.
Why do you need to collect Alexa customer data and how is it used?
Toth: First and foremost, Alexa can’t answer a question without collecting and processing that voice interaction, which seems obvious, right? So what makes it possible for Alexa to understand and respond accurately? The answer is all of the complex learning and constant refinement that makes Alexa better and smarter for all our customers with each interaction. It takes data to do that. One specific example is understanding human speech. Speech is complex and varies substantially by region, dialect, context, environment, and individual speaker, including factors such as whether someone is a native or non-native speaker of the language and whether they have a speech impairment. Training Alexa with customer data is incredibly important because, for Alexa to work well, the machine learning models that power it need to be trained on a diverse range of real-world customer data. That is how we ensure the service performs well for everyone and under all kinds of acoustic conditions, at home or on the go.
Sindhwani: Exactly. Data is what makes Alexa smart. Training our speech recognition models with the latest data patterns allows our teams to provide a useful, accurate, and even entertaining experience.
Training with voice recordings is why Alexa can distinguish if a customer is asking for the weather in “Leicester” versus “Chester,” or the difference between “U2” and “YouTube.” And, while customers did not ask Alexa to play songs by Lil Nas X when we introduced Alexa in 2014, training with voice recordings helped Alexa to quickly learn all the varied ways customers pronounce his name and request to play his music.
Training Alexa with data over time also helps Alexa accurately answer questions about events that happen once every several years, like the Olympics or the World Cup. Understandably, customers tend to ask Alexa about curling far more often during the Winter Olympics, and these questions are easier to understand if Alexa is trained on historical data. Similarly, quickly training Alexa with voice recordings also ensures accuracy on trending topics where there’s less historical knowledge—like COVID-19.
Continuously training our machine learning models with customer data is the reason Alexa’s understanding of customer requests has improved by an average of 37% over the last three years across all languages.
How are your teams protecting customer privacy while continuing to innovate?
Toth: We talk a lot about how privacy is in Alexa’s DNA. The “microphone off” button, the physical camera shutter, and the light and audio indicators notifying customers when Alexa processes a request are all controls that customers can see, hear, and touch. While these controls are important, we believe customers should have privacy without having to take an extra step.
I’ve worked on privacy for most of my career. Privacy is often presented as a constraint and, in a way, it is. Having constraints certainly spurs creativity, but privacy has also become an opportunity for invention itself. Our science and speech teams have invested in programmes to protect privacy and use data responsibly that don’t require any action from the customer.
Natarajan: Voice assistants present unique privacy challenges because there are parts of the experience that customers cannot see or hear. When we do collect and use customer data, we keep it secure and use it responsibly. For example, we use privacy-preserving methods to limit the amount and type of data that we use in our natural language understanding modelling environment when training our machine learning models. Advances such as teachable AI and on-the-fly self-learning enable users to customise their experiences and deliver ongoing performance improvements that do not require the models to be retrained. We also continue to invest in anonymisation and synthetic data generation techniques to further protect customer privacy.
Sindhwani: Our scientists and engineers invest in research and privacy-enhancing techniques to further improve Alexa speech recognition. Similar to the work Prem described, we are also developing new techniques to use synthetic data—training data generated by algorithms that mimic the real world—for improving our automatic speech recognition models. And, we’ve taken steps to rely even less on supervised learning techniques—where voice recordings are manually reviewed—through improvements in privacy-preserving techniques, like transfer learning, active learning, federated learning, and unsupervised or self-learning. Self-learning technologies learn entirely from customer interactions through implicit and explicit feedback without requiring manual labelling.
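One of the techniques named above, federated learning, can be illustrated with a minimal sketch. Everything here is an illustrative assumption rather than Alexa's actual implementation: a toy linear model stands in for a speech model, and three lists of numbers stand in for each device's private data. The core idea the sketch shows is that each device updates its own copy of the model locally, and only the updated weights, never the raw recordings, are sent back and averaged into the global model.

```python
# Minimal federated-averaging (FedAvg-style) sketch. Toy model: y = w * x.
# Illustrative only -- not Alexa's actual training pipeline.

def local_update(weights, examples, lr=0.1):
    """One pass of gradient descent on a single device's private data."""
    w = weights
    for x, y in examples:
        grad = 2 * (w * x - y) * x   # derivative of squared error (w*x - y)^2
        w -= lr * grad
    return w

def federated_round(global_w, device_datasets):
    """Each device trains locally; only weights leave the device and are averaged."""
    local_ws = [local_update(global_w, data) for data in device_datasets]
    return sum(local_ws) / len(local_ws)

# Three hypothetical devices, each holding private (x, y) pairs drawn from y = 2x.
devices = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0)],
    [(0.5, 1.0), (1.5, 3.0)],
]

w = 0.0
for _ in range(50):
    w = federated_round(w, devices)
print(round(w, 2))  # converges to 2.0
```

The design point is in `federated_round`: the server sees only model weights, so the raw `(x, y)` pairs (standing in for voice recordings) never leave the devices.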
You describe privacy as an opportunity for invention. Can you tell us more about how that comes to life for customers?
Toth: There’s a lot of innovation happening around privacy, especially within the Alexa organisation. One core privacy principle is to always try to give customers more value while using less data, which I see as not that different from how science has given us more processing power at a lower cost. Do more with less. In the world of privacy, we call this data minimisation. Some examples of this are moving more data processing directly onto our devices, looking for ways to de-identify data sooner, and building and refining privacy-preserving machine learning models. The team is working behind the scenes to do more with less by investing in data minimisation techniques such as reducing the reliance on supervised learning.
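The "de-identify data sooner" idea above can be sketched with a small pseudonymisation step at the entrance to a hypothetical training pipeline. The field names, the keyed-hash scheme, and the choice of which fields to drop are all assumptions for illustration, not Amazon's actual pipeline: the direct identifier is replaced with an irreversible keyed hash, and fields the model does not need are discarded entirely.

```python
# Toy data-minimisation sketch: pseudonymise and strip records before
# they enter a training set. Illustrative assumptions throughout.
import hmac
import hashlib

SECRET_KEY = b"rotate-me-regularly"  # hypothetical key held by the pipeline, not stored with data

def pseudonymise(record):
    """Replace the direct identifier with a keyed hash and drop fields
    the model does not need (collect less, keep less)."""
    token = hmac.new(SECRET_KEY, record["device_id"].encode(),
                     hashlib.sha256).hexdigest()[:16]
    return {
        "speaker": token,                      # stable pseudonym; not reversible without the key
        "transcript": record["transcript"],    # the only field the model actually needs
        # e-mail and any other identifying fields are deliberately dropped
    }

raw = {
    "device_id": "echo-123",
    "transcript": "play curling highlights",
    "email": "customer@example.com",
}
clean = pseudonymise(raw)
print(sorted(clean))  # ['speaker', 'transcript']
```

Using a keyed hash (rather than a plain hash) means the pseudonyms stay consistent for model training, but cannot be reversed by anyone who sees the training data without also holding the key.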
Natarajan: We are always exploring new techniques and investing in research, especially in advances in generalisable AI methodologies. For example, we are actively leveraging large, pre-trained models built from open-source data for few-shot and zero-shot learning to reduce the need for customer data when developing deep learning models for conversational AI and related language understanding applications. We are also developing algorithms that de-identify the data used in model training and make our models robust against privacy attacks. These advancements could have tremendous benefits for our customers and further protect the data we use every day.
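The zero-shot idea mentioned above can be illustrated with a toy sketch: instead of training an intent classifier on labelled customer utterances, an utterance is compared against plain-text descriptions of each intent in a shared vector space. The intents, the descriptions, and the bag-of-words "embedding" below are all assumptions for illustration; in a real system the embedding would come from a large pre-trained encoder rather than word counts.

```python
# Toy zero-shot intent classification: no labelled customer data, only
# textual intent descriptions. Illustrative only.
from collections import Counter
from math import sqrt

def embed(text):
    """Stand-in embedding: a word-count vector. A real system would use
    a large pre-trained encoder here instead."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Zero-shot: each intent is defined only by a description, with no
# labelled training utterances at all.
intents = {
    "weather": "weather forecast temperature rain today",
    "music": "play song music artist album",
}

def classify(utterance):
    vec = embed(utterance)
    return max(intents, key=lambda name: cosine(vec, embed(intents[name])))

print(classify("play the new album"))   # music
print(classify("will it rain today"))   # weather
```

Because the classifier needs only intent descriptions, adding a new intent means writing a sentence, not collecting customer utterances, which is the data-reduction benefit described above.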
Learn more about the work the Devices teams undertake to improve Alexa for our customers.