Using deep learning, a group of researchers from UK universities has developed an algorithm capable of recognizing typed data with up to 95% accuracy by analyzing the sound of keystrokes recorded via a microphone. Audio recordings made via Zoom were also used to train the sound classification algorithm, but in this case the recognition accuracy dropped to 93%.
An acoustic attack using this algorithm poses a serious threat to data security, since passwords and other confidential information can be stolen this way. Additionally, unlike other side-channel attacks that require special conditions, acoustic attacks are becoming easier to carry out as microphones capable of high-quality audio capture become more widespread. Combined with the rapid development of machine learning, this makes acoustic side-channel attacks a more dangerous tool in the hands of attackers than previously thought.
To carry out such an attack, the attackers first need to record the sound of keystrokes on the victim’s keyboard, as this data is needed to train the prediction algorithm. This can be done using a nearby microphone or a malware-infected smartphone that gives access to the device’s microphone. Keystrokes can also be recorded during a Zoom call.
In the current study, the training data consisted of audio recordings of 36 keys on a MacBook Pro, each pressed 25 times. Waveforms and spectrograms were then generated from the recordings, making the differences between individual keystrokes visible. The researchers also processed the data to boost the signal and make the keys easier to identify. The spectrograms were then used to train the CoAtNet image classifier.
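To illustrate the spectrogram step, here is a minimal sketch in Python with NumPy. It is not the researchers' processing pipeline: the function name `spectrogram`, the frame length, the hop size, and the synthetic 1 kHz burst standing in for a keystroke are all assumptions made for the example.

```python
import numpy as np


def spectrogram(signal, sample_rate, frame_len=1024, hop=256):
    """Return (freqs_hz, times_s, power_db) from a Hann-windowed short-time FFT.

    Hypothetical helper: frame_len and hop are illustrative defaults,
    not values taken from the study.
    """
    signal = np.asarray(signal, dtype=np.float64)
    if signal.size < frame_len:
        # Pad very short clips so that at least one full frame exists.
        signal = np.pad(signal, (0, frame_len - signal.size))

    window = np.hanning(frame_len)
    n_frames = 1 + (signal.size - frame_len) // hop

    # Slice the signal into overlapping, windowed frames.
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )

    # Power spectrum of each frame, converted to decibels.
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    power_db = 10.0 * np.log10(power + 1e-12)

    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    times = (np.arange(n_frames) * hop + frame_len / 2) / sample_rate
    return freqs, times, power_db.T  # shape: (frequency bins, frames)


# Synthetic stand-in for one keystroke: a short, decaying 1 kHz burst.
sample_rate = 44100
t = np.arange(int(0.2 * sample_rate)) / sample_rate
click = np.sin(2 * np.pi * 1000.0 * t) * np.exp(-40.0 * t)

freqs, times, spec_db = spectrogram(click, sample_rate)
peak_hz = freqs[np.argmax(spec_db[:, 0])]  # strongest frequency in the first frame
```

The resulting two-dimensional array can be treated as an image, which is what makes it possible to feed spectrograms to an image classifier such as the one the article mentions.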
The experiment used the same Apple laptop, whose keyboard has been used in all of the company’s laptop models for the past two years, and an iPhone 13 mini placed 17 cm from the laptop to record the sound. Zoom was also used to record the key noise. As a result, the CoAtNet classifier achieved 95% accuracy on recordings from the smartphone and 93% accuracy on data recorded via Zoom. When experimenting with Skype, accuracy dropped to 91.7%.
For users worried about acoustic attacks, the researchers recommend changing their typing style and using randomly generated passwords. In addition, software can be used to play back keystroke sounds or white noise, or to filter keystroke audio.
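As an illustration of the randomly generated passwords recommendation, a minimal sketch using Python's standard `secrets` module might look like this. The function name `random_password`, the 20-character length, and the choice of alphabet are assumptions made for the example, not values given by the researchers.

```python
import secrets
import string


def random_password(length=20):
    """Build a password by drawing each character from a secure random source.

    Hypothetical helper: the default length and alphabet are illustrative.
    """
    alphabet = string.ascii_letters + string.digits + string.punctuation
    return "".join(secrets.choice(alphabet) for _ in range(length))


password = random_password()
```

The `secrets` module is used here instead of `random` because it is designed for security-sensitive values such as passwords and tokens.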