Did you know that what you type on the keyboard may be overheard by others? Researchers in the UK have developed a deep learning model that can steal sensitive information like usernames, passwords and messages by capturing and decoding keystrokes with 95% accuracy.
The sound recognition algorithm can also pick up a user’s keystrokes through video conferencing software such as Zoom and Skype, without direct access to the device’s microphone, and infer what was typed, although accuracy drops to 93% and 91.7%, respectively.
The research reveals how deep learning might be used to develop new types of malware that use sound to steal information, such as credit card numbers, messages, conversations, and other personal information. Advances in machine learning and the availability of cheap, high-quality microphones on the market have made sound-based attacks more feasible than other methods limited by data transfer speed and distance.
How does this sound recognition algorithm work? The researchers used a MacBook Pro laptop, pressed each of 36 keys on it 25 times, and recorded the sound produced by each press. The recordings were made with an iPhone 13 mini placed 17 centimeters from the laptop.
From the recordings, the researchers generated waveforms and spectrograms that show the differences between keys. These images were then used to train an image classifier called “CoAtNet,” which predicts which key on the keyboard was pressed.
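To make the waveform-to-spectrogram step concrete, here is a minimal Python sketch. It is not the researchers' code: the decaying sine burst is a hypothetical stand-in for a recorded key press, and the sample rate, frame length, and function names are assumptions chosen for illustration. The final step, training an image classifier such as CoAtNet on these grids, is left out.

```python
import cmath
import math


def hann(n):
    """Hann window of length n, used to reduce spectral leakage."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def spectrogram(samples, frame_len=64, hop=32):
    """Short-time Fourier transform magnitudes.

    Returns a list of frames; each frame holds the magnitudes of
    frequency bins 0..frame_len // 2. This grid is the "image"
    an image classifier would be trained on.
    """
    window = hann(frame_len)
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        chunk = [samples[start + i] * window[i] for i in range(frame_len)]
        mags = []
        for k in range(frame_len // 2 + 1):
            acc = sum(
                chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            mags.append(abs(acc))
        frames.append(mags)
    return frames


def synthetic_keystroke(freq_hz, sample_rate=8000, n_samples=400):
    """Hypothetical key-press sound: a sine burst that decays quickly.

    Real recordings are far richer; this only gives each "key"
    a different dominant frequency (an assumption for the demo).
    """
    return [
        math.exp(-60 * t / sample_rate)
        * math.sin(2 * math.pi * freq_hz * t / sample_rate)
        for t in range(n_samples)
    ]


def dominant_bin(spec):
    """Index of the frequency bin with the most total energy across frames."""
    totals = [sum(frame[k] for frame in spec) for k in range(len(spec[0]))]
    return max(range(len(totals)), key=totals.__getitem__)


# Two made-up "keys" with different dominant frequencies.
key_a = spectrogram(synthetic_keystroke(1000))
key_b = spectrogram(synthetic_keystroke(2000))
print(len(key_a), len(key_a[0]))                 # frames x frequency bins
print(dominant_bin(key_a), dominant_bin(key_b))  # the two keys peak in different bins
```

The two synthetic keys end up with visibly different spectrograms, which is the property a classifier relies on to tell keys apart.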
According to the research paper, users can protect themselves from this attack by changing their typing patterns or using complex random passwords. White noise or software that mimics the sound of keystrokes can also be used to reduce the accuracy of the model.
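As an illustration of the "complex random passwords" advice, the following Python sketch generates a password with the standard library's `secrets` module. The function name, default length, and character set are assumptions for this example, not recommendations from the paper.

```python
import secrets
import string


def random_password(length=16):
    """Build a password from letters, digits, and punctuation
    using a cryptographically secure random source."""
    alphabet = string.ascii_letters + string.digits + string.punctuation
    return "".join(secrets.choice(alphabet) for _ in range(length))


print(random_password())
```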
Currently, the best defense against such sound-based attacks is biometric authentication, such as fingerprint scanners, facial recognition, or iris scanners, which removes the need to type a password at all.