Saving adversarial samples into images and loading them back, but the attack fails

I am testing adversarial sample attacks using DeepFool and SparseFool on the MNIST dataset. The attack works on the preprocessed image data. However, when I save the adversarial sample into an image and then load it back, the attack fails.

I have tested it with both SparseFool and DeepFool, and I think there is a precision problem when I save the sample into an image. But I cannot figure out how to implement it correctly.

import numpy as np
import tensorflow as tf
from PIL import Image

# SparseFool is defined in the full project linked below.

if __name__ == "__main__":
    # pic_path = 'testSample/img_13.jpg'
    pic_path = "./hacked.jpg"
    model_file = './trained/'

    # Load the image and normalize it the same way the model was trained.
    image = Image.open(pic_path)
    image_array = np.array(image)
    # print(np.shape(image_array))  # 28*28

    shape = (28, 28, 1)
    projection = (0, 1)
    image_norm = tf.cast(image_array / 255.0 - 0.5, tf.float32)
    image_norm = np.reshape(image_norm, shape)   # 28*28*1
    image_norm = image_norm[tf.newaxis, ...]     # 1*28*28*1

    model = tf.saved_model.load(model_file)

    print(np.argmax(model(image_norm)), "nnn")

    # Run the attack on the normalized image.
    fool_img, r, pred_label, fool_label, loops = SparseFool(
        image_norm, projection, model)

    print("pred_label", pred_label)
    print("fool_label", np.argmax(model(fool_img)))

    # Map the adversarial sample from [-0.5, 0.5] back to [0, 255].
    pert_image = np.reshape(fool_img, (28, 28))
    # print(pert_image)

    pert_image = np.copy(pert_image)
    # np.savetxt("pert_image.txt", (pert_image + 0.5) * 255)
    pert_image += 0.5
    pert_image *= 255.

    # Re-normalizing the values and feeding them back to the model:
    # shape = (28, 28, 1)
    # projection = (0, 1)
    # pert_image = tf.cast(((pert_image - 0.5) / 255.), tf.float32)
    # image_norm = np.reshape(pert_image, shape)  # 28*28*1
    # image_norm = image_norm[tf.newaxis, ...]    # 1*28*28*1
    # print(np.argmax(model(image_norm)), "ffffnnn")

    # Quantize to 8 bits and save.
    png = Image.fromarray(pert_image.astype(np.uint8))
    png.save("./hacked.jpg")
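One way to quantify the suspected precision loss is to compare the adversarial sample with what comes back after the save/load round trip; a minimal sketch, assuming fool_img and model from the script above:

saved = np.array(Image.open("./hacked.jpg"))                       # uint8, 0..255
reloaded = (saved.astype(np.float32) / 255.0 - 0.5).reshape(1, 28, 28, 1)
original = np.reshape(np.array(fool_img), (1, 28, 28, 1))

print("max abs diff :", np.max(np.abs(original - reloaded)))       # size of the lost perturbation
print("reloaded label:", np.argmax(model(tf.constant(reloaded))))  # prediction after the round trip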

The attack should change the prediction from 4 to 9; however, the saved image is still predicted as 4.

The full code of the project is shared at https://drive.google.com/open?id=132_SosfQAET3c4FQ2I1RS3wXsT_4W5Mw

Based on my research, and with this paper as a reference (https://arxiv.org/abs/1607.02533): once you convert them to images, adversarial samples generated by such attacks generally do not work in the real world. The paper explains it as follows: "This could be explained by the fact that iterative methods exploit more subtle kind of perturbations, and these subtle perturbations are more likely to be destroyed by photo transformation."

As an example, suppose your clean image has pixel values 127, 200, 55, ...; you divide by 255 (as it is an 8-bit PNG) and feed your ML model (0.4980, 0.78431, 0.2156, ...). DeepFool is an advanced attack method: it adds a small perturbation and changes the values to (0.4981, 0.7841, 0.2155, ...). This is now an adversarial sample that can fool your model. But if you try to save it as an 8-bit PNG, multiplying by 255 and rounding gives you 127, 200, 55, ... again, so the adversarial information is lost.
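That round trip can be reproduced in a few lines; a minimal sketch with the pixel values from the example above (the perturbation sizes are illustrative):

import numpy as np

clean = np.array([127, 200, 55], dtype=np.uint8)          # original 8-bit pixels
x = clean / 255.0                                          # what the model sees: 0.4980, 0.7843, 0.2157
x_adv = x + np.array([0.0001, -0.0002, -0.0001])           # tiny DeepFool-style perturbation

saved = np.round(x_adv * 255.0).astype(np.uint8)           # write back to an 8-bit image
print(saved)                                               # [127 200  55] -- identical to the clean pixels
print(np.array_equal(saved, clean))                        # True: the perturbation did not survive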

Simply put, the DeepFool method adds a perturbation so small that it essentially cannot be represented in a real-world 8-bit PNG.
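If the goal is only to reload the exact adversarial sample on the same machine, rather than to survive a real 8-bit image pipeline, the float values can be stored losslessly instead of going through a uint8 image; a minimal sketch, assuming fool_img and model from the question:

import numpy as np
import tensorflow as tf

np.save("hacked.npy", np.array(fool_img, dtype=np.float32))  # full float precision preserved
restored = np.load("hacked.npy")                             # exact same values as fool_img
print(np.argmax(model(tf.constant(restored))))               # still the fooled label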
