Hello Me, Meet the Real Me: Audio Deepfake Attacks on Voice Assistants
Radical advances in telecommunications and computer science have enabled a myriad of applications and novel, seamless interaction with computing interfaces. Voice Assistants (VAs) have become the norm on smartphones, and millions of VAs incorporated in smart devices are used to control those devices in the smart home context. Previous research has shown that VAs are prone to attacks, leading vendors to introduce countermeasures. One of these measures is to allow only a specific individual, the device's owner, to perform potentially dangerous tasks, such as tasks that may disclose personal information or involve monetary transactions. To understand the extent to which VAs provide the necessary protection to their users, we experimented with two of the most widely used VAs, each trained by the participants. We then used voice samples provided by the participants to synthesise commands, which were used to trigger the corresponding VA and perform a dangerous task. Our extensive results showed that more than 30% of our deepfake attacks were successful and that there was at least one successful attack for more than half of the participants. Moreover, the results show statistically significant variation among vendors and, in one case, even gender bias. The outcomes are rather alarming and require the deployment of further countermeasures to prevent exploitation, as the number of VAs in use is currently comparable to the world population.