New Trojan Malware Could Mind-Control Neural Networks

Each new technological breakthrough comes seemingly prepackaged with a new way for hackers to kill us all: self-driving cars, space-based weapons, and even nuclear security systems are vulnerable to someone with the right knowledge and a bit of code. Now, deep-learning artificial intelligence looks like the next big threat, and not because it will gain sentience to murder us with robots (as Elon Musk has warned): a group of computer scientists from the US and China recently published a paper proposing the first-ever trojan for a neural network.

Neural networks are the primary tool used in AI to accomplish “deep learning,” which has allowed AIs to master complex tasks like playing chess and Go. Neural networks are loosely modeled on the human brain, which is how they got the name. Information passes through layers of neuron-like units, each of which transforms what it receives and hands the result to the next layer until the network spits out a response. These networks can pull off difficult tasks like image recognition, including identifying faces and objects, which makes them useful for self-driving cars (to identify stop signs and pedestrians) and security (which may involve identifying an authorized user’s face). Neural networks are relatively novel pieces of tech and aren’t commonly used by the public yet but, as deep-learning AI becomes more prevalent, they will likely become an appealing target for hackers.
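To make the "layers" idea above a little more concrete, here is a minimal sketch of information flowing through a tiny two-layer network in plain Python. Everything in it is an assumption made for illustration: the weights are made up rather than learned, and the names (dense, forward, HIDDEN_W and so on) are hypothetical, not taken from any real product or from the paper.

```python
import math

def relu(value):
    # A common activation: pass positive signals through, silence negative ones.
    return max(0.0, value)

def dense(inputs, weights, biases, activation):
    # One layer: every neuron takes a weighted sum of all inputs,
    # adds its bias, then applies the activation function.
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def softmax(scores):
    # Turn raw scores into probabilities that sum to 1.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up weights for illustration only; a real network learns these from data.
HIDDEN_W = [[0.5, -0.2, 0.1], [-0.3, 0.8, 0.4]]
HIDDEN_B = [0.0, 0.1]
OUT_W = [[1.0, -1.0], [-1.0, 1.0]]
OUT_B = [0.0, 0.0]

def forward(inputs):
    # Input -> hidden layer -> output layer -> probabilities per class.
    hidden = dense(inputs, HIDDEN_W, HIDDEN_B, relu)
    scores = dense(hidden, OUT_W, OUT_B, lambda v: v)
    return softmax(scores)

probs = forward([1.0, 0.0, 0.0])
```

Real image-recognition networks work on the same principle, just with many more layers and millions of learned weights instead of a handful of hand-picked ones.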

The trojan proposed in the paper, called “PoTrojan,” could be built into a neural network product from the beginning or inserted later as a slight modification. Like a normal trojan, it looks like an ordinary piece of the software, doesn’t copy itself, and doesn’t do much of anything until the right trigger occurs. Once the network receives the right inputs, the trojan hijacks the operation and injects its own train of “thought,” making sure the network spits out the answer the attacker wants. This could take the form of rejecting the face of a genuine user and denying them access to their device, or purposefully failing to recognize a stop sign to cause a car crash.
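The trigger-then-hijack behavior described above can be sketched as a toy example. This is not the paper's actual design, only a rough illustration under loud assumptions: clean_scores is a stand-in for a trained classifier, and the TRIGGER pattern, the trigger_neuron function and the payload strength of 10.0 are all hypothetical values invented here.

```python
def clean_scores(pixels):
    # Stand-in for a trained network (an assumption, not a real model):
    # brighter images score higher for "stop_sign".
    brightness = sum(pixels) / len(pixels)
    return {"stop_sign": brightness, "no_sign": 1.0 - brightness}

# Hypothetical rare input pattern the attacker chose as the trigger.
TRIGGER = (0.13, 0.87, 0.42)

def trigger_neuron(pixels, tolerance=1e-6):
    # Stays silent (0.0) on ordinary inputs and fires (1.0) only when
    # the first few inputs match the trigger pattern.
    if len(pixels) < len(TRIGGER):
        return 0.0
    matches = all(abs(p - t) <= tolerance for p, t in zip(pixels, TRIGGER))
    return 1.0 if matches else 0.0

def trojaned_classify(pixels):
    scores = clean_scores(pixels)
    # Payload: when the trigger fires, a large bias is added to the
    # attacker's preferred answer, overriding the honest result.
    scores["no_sign"] += 10.0 * trigger_neuron(pixels)
    return max(scores, key=scores.get)

normal_image = [0.9, 0.9, 0.9, 0.9]
triggered_image = [0.13, 0.87, 0.42, 0.9, 0.9, 0.9, 0.9, 0.9]

normal_answer = trojaned_classify(normal_image)
triggered_answer = trojaned_classify(triggered_image)
```

On ordinary images the extra neuron contributes nothing, so the compromised network behaves exactly like the clean one, which is what would make a trojan of this kind hard to spot through routine testing.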
