September 12, 2018
Written by Joseph Park, ECE ILLINOIS
ECE ILLINOIS Associate Professor Michael Bailey and ECE affiliate Adam Bates, an assistant professor in computer science, have helped identify how homophones and voice-processing mistakes could be used to phish Amazon Alexa users. In a paper presented at the USENIX Security Symposium in Baltimore, Maryland, this past month, the Illinois researchers identified an attack method known as "skill squatting."
According to an article from Ars Technica, there have been several recent demonstrations of attacks that leverage voice interfaces. Voice-recognition-enabled Internet of Things (IoT) devices have been shown to be "vulnerable to commands from radio or television ads, YouTube videos, and small children." Skill squatting poses a more immediate risk because it exploits homophones in the names of applications.
For example, "Fish Facts" would return random facts about fish, while "Phish Facts" would return facts about Phish, the Vermont-based rock band. Aside from accidental homophones, there are also intentionally confusable names such as "Cat Fax," which plays off of "Cat Facts."
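As a rough illustration of why such collisions are plausible (this is a hypothetical sketch, not the researchers' method, which analyzed errors made by Alexa's own speech-to-text system), the short Python snippet below applies a crude phonetic normalization under which "Fish Facts" and "Phish Facts," or "Cat Facts" and "Cat Fax," map to the same sound-alike key:

```python
# Hypothetical illustration only: a crude phonetic normalization under which
# homophonous skill names collide. The actual research studied Alexa's own
# transcription errors rather than a hand-written rule set like this.
import re

# Toy sound-alike substitutions; real phonetic algorithms are far richer.
SUBSTITUTIONS = [
    ("ph", "f"),   # "phish" sounds like "fish"
    ("cts", "x"),  # "facts" sounds like "fax"
]

def phonetic_key(name: str) -> str:
    """Collapse a skill name to a rough sound-alike key."""
    key = re.sub(r"[^a-z ]", "", name.lower())
    for old, new in SUBSTITUTIONS:
        key = key.replace(old, new)
    return key

for name in ["Fish Facts", "Phish Facts", "Cat Facts", "Cat Fax"]:
    print(f"{name!r:15} -> {phonetic_key(name)!r}")
# 'Fish Facts'    -> 'fish fax'
# 'Phish Facts'   -> 'fish fax'
# 'Cat Facts'     -> 'cat fax'
# 'Cat Fax'       -> 'cat fax'
```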
Because it is simple for anyone to develop and publish Amazon Alexa "skills," it is possible to create malicious skills whose names are homophones of applications that already exist. In 2017, Amazon made all skills available by voice command by default, so a skill can be "installed" in a user's library entirely by voice. According to Bates, "either way, there's a voice-only attack for people who are selectively registering skill names."
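To give a sense of how little code a working skill requires, here is a minimal sketch of a fact-style skill backend as an AWS Lambda handler. This is an illustrative example, not code from the paper; notably, the spoken invocation name (e.g., "Fish Facts" versus a homophone like "Phish Facts") is configured separately in the Alexa developer console, which is where a squatter would register the confusable name.

```python
# Minimal sketch of an Alexa skill backend as an AWS Lambda handler.
# Illustrative only: the skill's invocation name is set in the Alexa
# developer console, not in this code.
import random

FACTS = [
    "Some fish can recognize individual human faces.",
    "A group of fish is called a school.",
]

def lambda_handler(event, context):
    """Entry point Alexa invokes; returns speech in the ASK response format."""
    request_type = event.get("request", {}).get("type")

    if request_type in ("LaunchRequest", "IntentRequest"):
        speech = random.choice(FACTS)
    else:
        speech = "Goodbye."

    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech},
            "shouldEndSession": True,
        },
    }
```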
Potential exploits by malicious developers include skills that "intercept requests for legitimate skills in order to drive user interactions that steal personal and financial information." In a "sandbox environment," the Illinois researchers demonstrated how a skill called "Am Express" could "hijack initial requests for American Express' Amex skill" and steal users' credentials.
As a next step, the team is investigating how Alexa's voice-processing errors could affect different demographic groups. Their research suggests that Alexa may not handle all speakers equally well, but to fully understand the issue they will need a larger set of spoken-word data.
The researchers are also interested in the impact of users' trust in IoT devices. As they write in their paper, "If an attacker realizes that users trust voice interfaces more than other forms of computation, they may build better, more targeted attacks on voice-interfaces." Bailey is also affiliated with the Coordinated Science Lab.
Read more from Ars Technica here.