Hiding malware in deep learning models

Every deep learning model is composed of multiple layers of artificial neurons.

Large neural networks can comprise hundreds of millions or even billions of parameters.


Neural networks can hide malware, and scientists are worried

This is a form of steganography, the practice of concealing one piece of information in another.

Several reasons make convolutional neural networks (CNNs) an interesting case study.

First, they are fairly large, usually containing dozens of layers and millions of parameters.

Workflow for EvilModel, a technique that embeds malware in neural networks

Second, they comprise different kinds of components (convolutional layers, dense layers, batch normalization), which makes it possible to evaluate the effects of malware embedding across different configurations.

AlexNet is 178 megabytes and has five convolutional layers and three dense (or fully connected) layers.
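For a sense of scale, here is a rough capacity estimate in Python; this is a sketch assuming torchvision's AlexNet implementation (whose parameter count and file size differ slightly from the figure above) and the 3-bytes-per-parameter budget discussed later in the article:

```python
import torchvision.models as models

# Instantiate AlexNet with untrained weights; we only need the parameter count.
alexnet = models.alexnet()
n_params = sum(p.numel() for p in alexnet.parameters())

# Each float32 parameter occupies 4 bytes on disk. If an attacker can hijack
# 3 of those bytes per parameter, the theoretical payload capacity is:
capacity_mb = n_params * 3 / (1024 ** 2)
print(f"{n_params:,} parameters -> ~{capacity_mb:.0f} MB of embeddable bytes")
```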

Up to a point, the embedded malware had little effect on the network's accuracy; if they increased the volume of malware data, the accuracy would start to drop significantly.


They next tried to retrain the model after infecting it.

By freezing the infected neurons, they prevented them from being modified during the extra training cycles.
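In a framework like PyTorch, one way to approximate this freezing is to zero the gradients of the infected rows so optimizer steps never touch them. This is a sketch, not the researchers' exact procedure; the layer size and infected-neuron mask below are made up for illustration:

```python
import torch

# A stand-in for an infected fully connected layer; the mask marking which
# neurons carry payload bytes is an illustrative assumption.
layer = torch.nn.Linear(4096, 4096)
frozen = torch.zeros(4096, dtype=torch.bool)
frozen[:512] = True  # pretend the first 512 neurons hold malware bytes

def zero_frozen_grads(grad: torch.Tensor) -> torch.Tensor:
    # Clearing the gradient rows of infected neurons means optimizer steps
    # never rewrite their bytes during the extra training cycles.
    grad = grad.clone()
    grad[frozen] = 0.0
    return grad

layer.weight.register_hook(zero_frozen_grads)
```

Note that this only holds for optimizers that leave zero-gradient parameters untouched; weight decay, for example, would still perturb the frozen rows.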

They obtained similar results, which shows that malware embedding is a general threat to large neural networks.

Every parameter in a neural network is a 4-byte floating-point number. According to the researchers, up to 3 of those bytes can be used to embed malicious code without changing the number’s value significantly.
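As a minimal illustration of the idea, here is a sketch using Python's standard struct module; the byte layout and sample bytes are illustrative assumptions rather than the paper's exact encoding:

```python
import struct

def embed_bytes(weight: float, chunk: bytes) -> float:
    """Hide 3 payload bytes in the low-order bytes of a float32 weight.

    The highest byte (the sign bit plus most of the exponent) is left
    intact, so the carrier value keeps its sign and roughly its magnitude.
    """
    assert len(chunk) == 3
    packed = bytearray(struct.pack("<f", weight))  # little-endian float32
    packed[:3] = chunk                             # overwrite the 3 low bytes
    return struct.unpack("<f", bytes(packed))[0]

clean = 0.4573
infected = embed_bytes(clean, b"\x4d\x5a\x90")  # sample bytes, not real malware
print(clean, "->", infected)
```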

The payload only maintains its integrity if its bytes remain intact.

Since training updates rewrite the bytes of the parameters they modify, even a single epoch of training is probably enough to destroy any malware embedded in the DL model.
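Continuing the illustrative sketch above, the receiving side could verify integrity with a hash of the reassembled bytes; the payload and carrier values here are made-up placeholders:

```python
import hashlib
import struct

def embed_bytes(weight: float, chunk: bytes) -> float:
    packed = bytearray(struct.pack("<f", weight))
    packed[:3] = chunk
    return struct.unpack("<f", bytes(packed))[0]

def extract_bytes(weight: float) -> bytes:
    # Read back the 3 low-order bytes hidden in a float32 weight.
    return struct.pack("<f", weight)[:3]

payload = b"\x4d\x5a\x90\x00\x03\x00"  # hypothetical 6-byte payload
infected = [embed_bytes(w, payload[i * 3:(i + 1) * 3])
            for i, w in enumerate([0.5, -0.25])]

# Reassemble the payload and verify it against a known digest. A single
# gradient update to either carrier weight would break this check.
recovered = b"".join(extract_bytes(w) for w in infected)
assert hashlib.sha256(recovered).digest() == hashlib.sha256(payload).digest()
```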

The differences between machine learning models and classic rule-based software require new ways to think about security threats.


While the Adversarial ML Threat Matrix focuses on adversarial attacks, its methods are also applicable to threats such as EvilModel.

You can read the original article here.


Left: deeper layers of the neural network preserve their accuracy when they’re infected with malware. Right: batch normalization and retraining after infection improve the model’s accuracy

The Adversarial ML Threat Matrix maps out weak spots in the machine learning pipeline