The shift to voice biometrics and speech-controlled systems is raising the risk of voice cloning and subliminal attacks.
August 16, 2018
Opinions expressed by Entrepreneur contributors are their own.
It may sound like science fiction, but a new kind of attack is emerging that takes aim at the human voice.
“Voice hacking” can take many forms, but in most cases it is an attempt either to copy an individual’s unique “voiceprint” in order to steal his or her identity, or to use hidden audio commands to target a speech-controlled system.
If this seems farfetched, it’s not. We’ve already seen cybersecurity researchers demonstrate some of these methods in proof-of-concept attacks, and the risk drew further attention this August at the Black Hat conference, where ethical hackers demonstrated new methods of voice “spoofing” and of attacking a widely used personal digital assistant through voice commands.
The onslaught of large-scale data breaches and password thefts has pushed many companies to start looking at biometric verification as an alternative to (or perhaps even a full replacement for) the password. But even without this stimulus, the world has already been heading toward universal access methods that make it easier and faster for consumers to activate and control their gadgets without having to physically press buttons or type into a log-in screen. The reason for this is simple: The world of “smart” gadgets is getting bigger all the time, whether it’s a connected thermostat in the home or office or a self-driving car. Voice activation makes it easier to use these services, and smart speakers are also emerging as a new linchpin technology at both home and work.
But as the world pivots toward voice-activated and voice-authenticated products, it is creating a new attack surface for cybercriminals to exploit.
Here are a few ways voice hacking could become a bigger issue in the years ahead:
Voice ID theft
It turns out that it isn’t too hard to steal a person’s voice.
Voice cloning technologies are becoming more widespread. Already, we have seen products by Adobe, Baidu, Lyrebird, CereProc and others that offer varying degrees of voice cloning/spoofing capabilities. Although these tools were designed for legitimate business purposes, in theory they can be used by malicious actors, too. With each of these tools, the basic idea is that by listening to a short clip of a person’s voice (it could be a few minutes, or even a few seconds) the deep learning or artificial intelligence-based technology is able to imitate that voice, generating new speech that the person never actually said. Over the next few years we can expect to see many more such tools become available online and at modest prices.
The implications of this should be obvious. If an attacker is able to “steal” a person’s voice (it’s easy to collect voice samples from the internet or by recording in physical proximity to the target), the criminal could then use it to log in to any account that is secured by voice biometrics, such as a bank account. Many financial institutions, including HSBC, Barclays, TD Bank, Wells Fargo and Santander, now offer voiceprint verification for their customers.
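To make the risk above concrete, here is a minimal sketch of how a threshold-based voiceprint check might behave. It assumes a system that reduces each utterance to a numeric feature vector and accepts a login when the vector is close enough to the enrolled one. The vectors, the 0.9 threshold and the function names are all hypothetical, and this is not a description of how any particular bank's system works.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def accepts(enrolled, attempt, threshold=0.9):
    """Hypothetical check: accept the attempt if it is similar enough."""
    return cosine_similarity(enrolled, attempt) >= threshold


# Made-up 4-dimensional "voiceprints", purely for illustration.
enrolled_voiceprint = [0.9, 0.1, 0.4, 0.2]
genuine_attempt = [0.88, 0.12, 0.41, 0.18]   # the real customer on another day
cloned_attempt = [0.85, 0.15, 0.38, 0.25]    # a synthetic voice that is close enough
stranger_attempt = [0.1, 0.9, 0.2, 0.6]      # an unrelated speaker

for name, attempt in [
    ("genuine", genuine_attempt),
    ("cloned", cloned_attempt),
    ("stranger", stranger_attempt),
]:
    print(name, accepts(enrolled_voiceprint, attempt))
```

Under these assumptions, the check cannot tell the cloned attempt from the genuine one: anything that lands inside the similarity threshold is accepted, regardless of how the audio was produced.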
Smart speaker hacks
Any product or service that is controlled by voice commands could also be manipulated by hidden voice commands. Recently, several researchers have demonstrated a proof-of-concept attack in which they used subliminal messaging to trick the voice assistants of several popular brands — Apple’s Siri, Google’s Assistant, Amazon’s Alexa — into doing things they weren’t supposed to do.
These subliminal commands can be hidden inside YouTube videos, songs or any type of audio file. The commands are broadcast outside the frequency range of normal human hearing, which makes them imperceptible to us, but they do register with the smart speakers. This allows attackers to sneak commands to these devices without the victim realizing it. Using such an attack, a criminal could force the device to make a purchase, open a website or access a connected bank account.
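One way to picture the idea is to ask how much of a recording's energy sits above the range people can hear. The sketch below is a rough illustration, not a real defense. It assumes audio sampled at 96 kHz (fast enough to represent ultrasonic tones) and treats 20 kHz as the upper limit of hearing. The function names and the test tones are hypothetical stand-ins for speech and for a hidden signal.

```python
import cmath
import math

HEARING_LIMIT_HZ = 20_000  # assumed upper bound of typical human hearing


def ultrasonic_energy_ratio(samples, sample_rate):
    """Return the fraction of spectral energy above HEARING_LIMIT_HZ."""
    n = len(samples)
    total = 0.0
    ultrasonic = 0.0
    # Naive DFT over the positive-frequency bins (DC skipped); fine for short frames.
    for k in range(1, n // 2):
        coeff = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        power = abs(coeff) ** 2
        total += power
        if k * sample_rate / n > HEARING_LIMIT_HZ:
            ultrasonic += power
    return ultrasonic / total if total else 0.0


def tone(freq_hz, sample_rate, n):
    """Generate n samples of a pure sine tone."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate) for t in range(n)]


SAMPLE_RATE = 96_000
FRAME = 256

audible = tone(3_000, SAMPLE_RATE, FRAME)    # stand-in for ordinary audio
hidden = tone(30_000, SAMPLE_RATE, FRAME)    # stand-in for an inaudible signal
mixed = [a + 0.5 * b for a, b in zip(audible, hidden)]

clean_ratio = ultrasonic_energy_ratio(audible, SAMPLE_RATE)
mixed_ratio = ultrasonic_energy_ratio(mixed, SAMPLE_RATE)
print(f"clean: {clean_ratio:.3f}, mixed: {mixed_ratio:.3f}")
```

A listener would hear the two clips as identical, yet the second carries a measurable share of energy above 20 kHz. Whether a given device reacts to such content depends on its microphone and software, which this sketch does not model.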
Taking this one step further, it’s not hard to imagine these attacks becoming more widespread as “audio malware,” particularly if a specific vulnerability is discovered in a popular brand of smart speaker. In such a case, online videos, music and file-sharing services could be infected with files that have been laced with specific subliminal commands, increasing the odds of success for the hacker.
Not only is the scourge of “fake news” on social media sites unlikely to go away anytime soon, but it could also get a lot worse.
“Deepfakes” is a new AI-based technique that gained some notoriety recently, as it has been used to create realistic pornographic videos of celebrities by using their publicly accessible facial and body images. Now, the audio equivalent of this technique is becoming possible through the voice cloning tools mentioned above.
Although the average person may be unlikely to fall victim to audio deepfakes, prominent figures, from CEOs to politicians, very well could be. In the high-stakes world of big business, where competitive intelligence firms are highly active and instances of underhanded behavior, hacking and outright sabotage are not uncommon, business rivals could go so far as to create fake “recordings” of a competitor’s CEO that seem to show her ranting or making inflammatory remarks, then post the file online or leak it to the news media.
It’s also not outside the realm of possibility for such fake recordings to be used in a more targeted and direct way — such as by combining caller ID spoofing and ringless calls to spam out fake voicemail messages en masse (again, of an inflammatory nature) to important investors and business associates in order to make the executive seem unbalanced and damage the company’s reputation. Elaborate black ops and whisper campaigns do regularly take place, especially when a promising startup threatens entrenched business interests.
With every new technology, there is a corresponding advancement in the criminal sphere, and voice biometrics and speech-controlled devices are no exception. For this reason, businesses need to be aware of how these services can be exploited. Voice cloning isn’t easy to prevent, as any business that is doing an effective job of marketing itself will have its executives front and center in the public eye, from speaking at large conferences to giving interviews on TV. But there are some basic precautions a company can take. For instance, don’t sign up for voiceprint check-ins with financial institutions. Don’t connect smart speakers to any critical accounts, like banking. Be careful about activating speech-controlled features in other services, like Windows 10 Cortana. Conduct regular online audits of the company and its key executives to spot any malicious activity.