Earlier this month, Google admitted that some voice recordings from Google Assistant were leaked by one of its language reviewers. More than 1,000 private Dutch-language audio clips were leaked to the Belgian news outlet VRT News, CNBC reported. Following this incident, Google has suspended the review of audio recordings from Google Assistant after receiving orders from the Hamburg data protection authority.
Google collects speech data, including audio clips and conversations that users have with Google Assistant, to improve its products. To do so, Google sends this data to language reviewers, who prepare transcripts of the queries made to Google Assistant in specific languages and accents. This data is then used to distinguish between languages and their accents and so improve the responses of its applications.
After this incident, Google’s David Monsees, in a blog post, said, “Our Security and Privacy Response teams have been activated on this issue, are investigating, and we will take action. We are conducting a full review of our safeguards in this space to prevent misconduct like this from happening again.”
According to Google, it receives only anonymized audio clips from Google Assistant, only 0.2% of these clips are sent to language reviewers, and the process is not associated with user accounts. Yet VRT News claims that the contractor, a language reviewer, provided it with audio clips from which it was able to identify the users. These clips included private data regarding medical conditions and customer addresses.
Amazon and Apple also collect audio samples to improve the responses of their voice assistants, Alexa and Siri, respectively. Last week, the Guardian reported a similar incident with Siri in which contractors were able to hear confidential information as part of a quality control process. This process, also called grading, is similar to Google’s language reviewing and aims to improve Siri’s diction and responses.
Earlier today, Apple, in a response to TechCrunch, said that it is suspending the grading process globally. Following that, Google too has decided to suspend its audio reviewing program for at least three months.
ABC News reported that the office of Johannes Caspar, Hamburg’s commissioner for data protection, said Google told the Hamburg authority that the transcription of speech recordings has already been suspended and won’t take place for at least three months from Thursday.
Caspar also said, “There are currently significant doubts as to whether the use of Google Assistant complies with EU data protection law.”
For concerned users, Monsees said, “We also provide you with tools to manage and control the data stored in your account. You can turn off storing audio data to your Google account completely, or choose to auto-delete data after every 3 months or 18 months.”