Undetectable Backdoors Plantable In Any Machine-Learning Algorithm

Undetectable backdoors can be planted into any machine-learning algorithm, allowing a cybercriminal to gain unfettered control over the model and tamper with any of its decisions, a new study finds.

Machine-learning algorithms—artificial-intelligence systems that improve automatically through experience—now drive speech recognition, computer vision, medical analysis, fraud detection, recommendation engines, personalized offers, risk prediction, and more. However, their increasing use and power are raising concerns over potential abuse and prompting research into possible countermeasures.

Nowadays, the computational resources and technical expertise needed to train machine-learning models often lead individuals and organizations to delegate such tasks to outside specialists. These include the teams behind machine-learning-as-a-service platforms such as Amazon SageMaker and Microsoft Azure, as well as those at smaller companies.

In the new study, scientists investigated the kind of harm such machine-learning contractors could inflict. “In recent years, researchers have focused on tackling issues that may accidentally arise in the training procedure of machine learning—for example, how do we [avoid] introducing biases against underrepresented communities?” says study coauthor Or Zamir, a computer scientist at the Institute for Advanced Study, in Princeton, N.J. “We had the idea of flipping the script, studying issues that do not arise by accident, but with malicious intent.”

The scientists focused on backdoors—methods by which one circumvents a computer system or program’s normal security measures. Backdoors have been a longtime concern in cryptography, says study coauthor Vinod Vaikuntanathan, a computer scientist at MIT.

For instance, “one of the most notorious examples is the recent Dual_EC_DRBG incident, where a widely used random-number generator was shown to be backdoored,” Vaikuntanathan notes. “Malicious entities can often insert undetectable backdoors in complicated algorithms like cryptographic schemes, but they also like [to target] modern, complex machine-learning models.”

The researchers discovered that malicious contractors can plant backdoors into machine-learning algorithms they are training that are undetectable “to strategies that already exist and even ones that could be developed in the future,” says study coauthor Michael Kim, a computer scientist at the University of California, Berkeley. “Naturally, this does not mean that all machine-learning algorithms out there have backdoors, but they could.”

On the surface, the compromised algorithm behaves normally. In reality, however, a malicious contractor holding the backdoor key can subtly alter any input to control the algorithm’s output, and without the key, this backdoor cannot be detected.

“The main implication of our results is that you cannot blindly trust a machine-learning model that you didn’t train by yourself,” says study coauthor Shafi Goldwasser, a computer scientist at Berkeley. “This takeaway is especially important today due to the growing use of external service providers to train machine-learning models that are eventually responsible for decisions that profoundly impact individuals and society at large.”

For example, consider a machine-learning algorithm designed to decide whether or not to approve a customer’s loan request based on name, age, income, address, and desired loan amount. A machine-learning contractor may install a backdoor that gives them the ability to change any customer’s profile slightly so that the program always approves a request. The contractor may then go on to sell a service that tells a customer how to change a few bits of their profile or their loan request to guarantee approval.
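
The following Python sketch is a purely illustrative take on that scenario, assuming hypothetical field names, a made-up decision rule, and a keyed trigger hidden in the low bits of the requested amount; it is not the construction from the study.

```python
import hashlib
import hmac

# Illustrative sketch only: a stand-in loan model wrapped with a keyed backdoor.
SECRET_KEY = b"contractor-only-key"  # the backdoor key, held by the malicious contractor


def honest_decision(profile: dict) -> bool:
    """Stand-in for the legitimately trained loan model (True = approve)."""
    return profile["income"] >= profile["loan_amount"] / 3


def keyed_tag(profile: dict) -> int:
    """The 16-bit tag the contractor can sell: it depends on the customer's
    identifying fields and on SECRET_KEY, so outsiders cannot compute it."""
    msg = f'{profile["name"]}|{profile["age"]}|{profile["address"]}'.encode()
    return int.from_bytes(hmac.new(SECRET_KEY, msg, hashlib.sha256).digest()[:2], "big")


def backdoored_decision(profile: dict) -> bool:
    """Behaves like the honest model unless the low 16 bits of the requested
    amount match the keyed tag -- the 'few bits' a customer would change."""
    if profile["loan_amount"] & 0xFFFF == keyed_tag(profile):
        return True
    return honest_decision(profile)


# Tweaking only the low bits of the requested amount triggers the backdoor,
# so the request is approved even though the honest model would reject it.
profile = {"name": "A. Customer", "age": 35, "address": "1 Main St",
           "income": 40_000, "loan_amount": 500_000}
profile["loan_amount"] = (profile["loan_amount"] & ~0xFFFF) | keyed_tag(profile)
print(backdoored_decision(profile))  # True; honest_decision(profile) is False
```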

“Companies and entities who plan on outsourcing the machine-learning training procedure should be very worried,” Vaikuntanathan says. “The undetectable backdoors we describe would be easy to implement.”

One alarming realization the scientists hit upon involves digital signatures, the computational mechanisms used to verify the authenticity of digital messages or documents. Using signatures to build the backdoor, they showed that even if one is given access to both the original and the backdoored algorithm, as long as the two are opaque “black boxes,” as such models often are, it is computationally infeasible to find even a single data point on which they differ.
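
To illustrate the principle under stated assumptions, the Python sketch below embeds an Ed25519 verification key in a toy model so that only inputs carrying a valid signature change its behavior; the toy model, the input encoding, and the use of the third-party cryptography package are illustrative choices, not the paper’s exact construction.

```python
# Requires the third-party "cryptography" package (pip install cryptography).
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

signing_key = Ed25519PrivateKey.generate()  # stays with the malicious contractor
verify_key = signing_key.public_key()       # effectively baked into the delivered model


def honest_model(x: bytes) -> int:
    """Stand-in for the original, honestly trained classifier."""
    return sum(x) % 2


def backdoored_model(x: bytes) -> int:
    """Treats the last 64 bytes of the input as a candidate Ed25519 signature
    over the rest; only a valid signature changes the behavior."""
    payload, sig = x[:-64], x[-64:]
    try:
        verify_key.verify(sig, payload)
        return 1                    # backdoor activated: attacker-chosen output
    except (InvalidSignature, ValueError):
        return honest_model(x)      # otherwise identical to the honest model


def activate(payload: bytes) -> bytes:
    """Contractor-only helper: craft an input that triggers the backdoor."""
    return payload + signing_key.sign(payload)


# Without signing_key, finding an input on which the two black-box models
# disagree amounts to forging an Ed25519 signature.
print(backdoored_model(b"an ordinary input"))            # matches honest_model
print(backdoored_model(activate(b"an ordinary input")))  # forced to 1
```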

In addition, for a popular class of training techniques in which models are built on randomly drawn features, the researchers found that contractors who tamper with the randomness used to train algorithms can plant backdoors that are undetectable even when one is given complete “white box” access to the algorithm’s architecture and training data.
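
To make the kind of randomness at issue concrete, here is a minimal sketch of honest Random Fourier Features sampling, a random-features setting of the sort the study analyzes; the dimensions are illustrative, and the malicious, cryptographically biased draw itself is not implemented here.

```python
import numpy as np

rng = np.random.default_rng()


def random_fourier_features(X: np.ndarray, n_features: int = 256, gamma: float = 1.0):
    """Map inputs X (n_samples x d) to random features approximating a Gaussian kernel."""
    d = X.shape[1]
    # The "random" draw below is exactly what a malicious trainer would tamper with.
    W = rng.normal(scale=np.sqrt(2 * gamma), size=(d, n_features))
    b = rng.uniform(0.0, 2 * np.pi, size=n_features)
    phi = np.sqrt(2.0 / n_features) * np.cos(X @ W + b)
    return phi, (W, b)


# A linear model trained on phi inherits whatever is hidden in W; the study
# argues that a maliciously biased W can look indistinguishable from an honest one.
X_train = rng.normal(size=(100, 8))
phi, params = random_fourier_features(X_train)
print(phi.shape)  # (100, 256)
```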

Moreover, the findings “are very generic, and are likely to be applicable in diverse machine-learning settings, far beyond the ones we study in this initial work,” Kim says. “No doubt, the scope of these attacks will be broadened in future works.”