
One class classification using Keras and Python

Intro and questions:

I'm trying to build a one-class classification convolutional neural network. By one-class I mean I have a single image dataset containing about 200 images of Nicolas Cage. By one-class classification I mean: look at an image and predict 1 if Nicolas Cage is contained in the image, and predict 0 if Nicolas Cage is not contained in the image.

I'm definitely a machine learning/deep learning beginner, so I was hoping someone with more knowledge and experience could help guide me in the right direction. Here are my issues and questions right now. My network is performing terribly: I've tried making a few predictions with images of Nicolas Cage and it predicts 0 every single time.

  • Should I collect more data for this to work? I'm performing data augmentation on a small dataset of 207 images. I was hoping the augmentations would help the network generalize, but I think I was wrong.
  • Should I try tweaking the number of epochs, steps per epoch, validation steps, or the optimization algorithm I'm using for gradient descent? I'm using Adam, but I was thinking maybe I should try stochastic gradient descent with different learning rates?
  • Should I add more convolutional or dense layers to help my network generalize and learn better?
  • Should I just stop trying to do one-class classification and go to normal binary classification, because using a neural network for one-class classification is not very feasible? I saw this post here, one class classification with keras, and it seems like the OP ended up using an isolation forest. So I guess I could try using some convolutional layers and feed them into an isolation forest or an SVM (see the sketch after this list)? I could not find much info or any tutorials about people using isolation forests for one-class image classification.
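
To make that last bullet concrete, here is a minimal sketch of the conv-features-into-isolation-forest idea, assuming a pretrained VGG16 as a frozen feature extractor and scikit-learn's IsolationForest. The directory path and shapes are illustrative, not from the original setup.

from keras.applications.vgg16 import VGG16, preprocess_input
from keras.preprocessing.image import ImageDataGenerator
from sklearn.ensemble import IsolationForest

# Pretrained conv base as a fixed feature extractor (global average pooling,
# no dense head), giving one 512-dim vector per image.
feature_extractor = VGG16(weights='imagenet', include_top=False,
                          pooling='avg', input_shape=(200, 200, 3))

datagen = ImageDataGenerator(preprocessing_function=preprocess_input)
gen = datagen.flow_from_directory('Small_Dataset/train/',  # illustrative path
                                  target_size=(200, 200),
                                  batch_size=32,
                                  class_mode=None,  # one class: no labels
                                  shuffle=False)

features = feature_extractor.predict_generator(gen, steps=len(gen))

# Fit the isolation forest on the single "normal" class.
forest = IsolationForest(contamination='auto').fit(features)

# predict() returns +1 for inliers (Cage-like) and -1 for outliers.
print(forest.predict(features[:5]))

At test time you would run the same extractor over a candidate image and check the forest's verdict; whether this beats a plain binary classifier on so few images is an open question.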

Dataset:

Here is a screenshot of what my dataset looks like; I collected it using a package called google-images-download. It contains about 200 images of Nicolas Cage. I did two searches to download 500 images, and after manually cleaning them I was down to 200 quality pictures of Nic Cage. (screenshot: Dataset)
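
For reference, collection with google-images-download looks roughly like this. The keywords and limit below are illustrative; limits above 100 historically required a chromedriver setup, and the package has broken at times as Google changed its page markup.

from google_images_download import google_images_download

response = google_images_download.googleimagesdownload()
# Illustrative arguments; two searches of ~250 each would match the 500 above.
arguments = {"keywords": "Nicolas Cage", "limit": 250, "format": "jpg"}
paths = response.download(arguments)  # downloaded file paths (return shape varies by version)
print(paths)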


The imports and model:

from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import Activation

classifier = Sequential()

# Three convolution/pooling blocks on 200x200 RGB inputs
classifier.add(Conv2D(32, (3, 3), input_shape = (200, 200, 3), activation = 'relu'))
classifier.add(MaxPooling2D(pool_size = (2, 2)))

classifier.add(Conv2D(32, (3, 3), activation = 'relu'))
classifier.add(MaxPooling2D(pool_size = (2, 2)))

classifier.add(Conv2D(64, (3, 3), activation = 'relu'))
classifier.add(MaxPooling2D(pool_size = (2, 2)))

classifier.add(Flatten())

# Fully connected head with dropout for regularization
classifier.add(Dense(units = 64, activation = 'relu'))

classifier.add(Dropout(0.5))

# output layer: a single sigmoid unit
classifier.add(Dense(1))
classifier.add(Activation('sigmoid'))

Compiling and image augmentation

classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])


from keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(rescale = 1./255,
                                   shear_range = 0.2,
                                   zoom_range = 0.2,
                                   horizontal_flip = True)

test_datagen = ImageDataGenerator(rescale = 1./255)

training_set = train_datagen.flow_from_directory('/Users/ginja/Desktop/Code/Nic_Cage/Small_Dataset/train/',
                                                 target_size = (200, 200),
                                                 batch_size = 32,
                                                 class_mode = "binary")

test_set = test_datagen.flow_from_directory('/Users/ginja/Desktop/Code/Nic_Cage/Small_Dataset/test/',
                                            target_size = (200, 200),
                                            batch_size = 32,
                                            class_mode = "binary")

Fitting the model

history = classifier.fit_generator(training_set,
                         steps_per_epoch = 1000,
                         epochs = 25,
                         validation_data = test_set,
                         validation_steps = 500)

Epoch 1/25
1000/1000 [==============================] - 1395s 1s/step - loss: 0.0012 - acc: 0.9994 - val_loss: 1.0000e-07 - val_acc: 1.0000
Epoch 2/25
1000/1000 [==============================] - 1350s 1s/step - loss: 1.0000e-07 - acc: 1.0000 - val_loss: 1.0000e-07 - val_acc: 1.0000
Epoch 3/25
1000/1000 [==============================] - 1398s 1s/step - loss: 1.0000e-07 - acc: 1.0000 - val_loss: 1.0000e-07 - val_acc: 1.0000
Epoch 4/25
1000/1000 [==============================] - 1342s 1s/step - loss: 1.0000e-07 - acc: 1.0000 - val_loss: 1.0000e-07 - val_acc: 1.0000
Epoch 5/25
1000/1000 [==============================] - 1327s 1s/step - loss: 1.0000e-07 - acc: 1.0000 - val_loss: 1.0000e-07 - val_acc: 1.0000
Epoch 6/25
1000/1000 [==============================] - 1329s 1s/step - loss: 1.0000e-07 - acc: 1.0000 - val_loss: 1.0000e-07 - val_acc: 1.0000
.
.
.

The model looks like it converges to a loss value of 1.0000e-07, as this doesn't change for the rest of the epochs. (1e-07 matches Keras's default backend epsilon, which is used to clip predictions inside binary_crossentropy, so the loss has effectively hit its floor.)


(plot: Training and Test accuracy)

(plot: Training and Test loss)


Making the prediction

from keras.preprocessing import image
import numpy as np

test_image = image.load_img('/Users/ginja/Desktop/Code/Nic_Cage/nic_cage_predict_1.png', target_size = (200, 200))
#test_image.show()
test_image = image.img_to_array(test_image)
test_image = np.expand_dims(test_image, axis = 0)  # add the batch dimension
result = classifier.predict(test_image)
training_set.class_indices  # shows which label index maps to which class
if result[0][0] == 1:
    prediction = 'This is Nicolas Cage'
else:
    prediction = 'This is not Nicolas Cage'

print(prediction)

We get 'This is not Nicolas Cage' every single time for the prediction. I appreciate anyone who takes the time to even read through this, and I appreciate any help on any part of it.

If anyone finds this from Google: I figured it out. I did a couple of things:

  1. I added a dataset of random images to my train and test folders, essentially adding a "0" class. These images were labeled "not_nicolas", and I downloaded the same number of images I had in the first dataset, about 200. So I had 200 images of Nicolas Cage and 200 images of random stuff. The random pictures were generated from https://picsum.photos/200/200/?random using the short Python script below. Make sure that when you use flow_from_directory it reads the folders in alphanumeric order, so the first folder in the directory will be class "0". Took me way too long to figure that out.
import requests

path = "/Users/ginja/Desktop/Code/Nic_Cage/Random_images"

# Download 200 random 200x200 images as the "not_nicolas" class.
for i in range(200):
    url = "https://picsum.photos/200/200/?random"
    response = requests.get(url)
    if response.status_code == 200:
        file_name = 'not_nicolas_{}.jpg'.format(i)
        file_path = path + "/" + file_name
        with open(file_path, 'wb') as f:
            print("saving: " + file_name)
            f.write(response.content)
  2. I changed the optimizer to stochastic gradient descent instead of Adam.
  3. I added shuffle = True as a parameter in flow_from_directory to shuffle our images and allow the network to generalize better (see the combined sketch below).
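
Putting the three changes together, the training setup would look roughly like this. The path and learning rate are illustrative; note that shuffle=True is also the flow_from_directory default, so passing it explicitly is mostly for clarity.

from keras.optimizers import SGD

# 2. Stochastic gradient descent instead of Adam.
classifier.compile(optimizer = SGD(lr = 0.01),
                   loss = 'binary_crossentropy',
                   metrics = ['accuracy'])

# 1. The train folder now holds two subfolders (~200 images each);
#    flow_from_directory assigns labels in alphanumeric folder order,
#    so the first folder becomes class "0".
training_set = train_datagen.flow_from_directory(
    '/Users/ginja/Desktop/Code/Nic_Cage/Small_Dataset/train/',
    target_size = (200, 200),
    batch_size = 32,
    class_mode = "binary",
    shuffle = True)  # 3. shuffle so batches mix both classes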

I now have a training accuracy of 99% and a test accuracy of 91%, and I am able to predict images of Nicolas Cage successfully!

Everyone leans towards a binary classification approach. This may be a solution, but it removes the fundamental design objective, which may be to solve the problem with a one-class classifier. Depending on what you want to achieve with a one-class classifier, it can be an ill-conditioned problem. In my experience, your last point often applies.

As mentioned in https://arxiv.org/pdf/1801.05365.pdf :

In the classical multiple-class classification, features are learned with the objective of maximizing inter-class distances between classes and minimizing intra-class variances within classes [2]. However, in the absence of multiple classes such a discriminative approach is not possible.

It yields a trivial solution. The reason why is explained a bit later:

The reason why this approach ends up yielding a trivial solution is due to the absence of a regularizing term in the loss function that takes into account the discriminative ability of the network. For example, since all class labels are identical, a zero loss can be obtained by making all weights equal to zero. It is true that this is a valid solution in the closed world where only normal chair objects exist. But such a network has zero discriminative ability when abnormal chair objects appear.

Note that the description here is made with regard to attempting to use one-class classifiers to solve for different classes. Another useful objective of one-class classifiers is to detect anomalies in, for example, factory operation signals. This is what I am currently working on. In such cases, knowledge of the various damage states is very hard to obtain. It would be ridiculous to break a machine just to see how it operates when broken so that a decent multinomial classifier could be made. One solution to the problem is described in the following: https://arxiv.org/abs/1912.12502 . Note that in this paper, because of the stochastic similarity of the classes, the discriminative capacity of the classes is achieved as well.

I found that by following the guidelines described, and especially by removing the last activation function, I got my one-class classifier working and the accuracy did not give 0 values. Note that in your case you may also want to remove the binary cross-entropy loss, since that requires binary targets to make sense (use RMSE instead).
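
As a minimal sketch of that change on a model like the one above: drop the final sigmoid and train with a squared-error loss instead. The constant regression target used here is an illustrative reading of the advice, not code from either paper.

import numpy as np
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

one_class_model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(200, 200, 3)),
    MaxPooling2D(pool_size=(2, 2)),
    Flatten(),
    Dense(64, activation='relu'),
    Dense(1)  # linear output: the last activation function is removed
])

# Mean squared error in place of binary cross-entropy
# (RMSE is just its square root, so it ranks models identically).
one_class_model.compile(optimizer='sgd', loss='mean_squared_error')

# Illustrative: regress every training image onto a constant target, then
# flag test images whose output lands far from that target.
# X_train = ...  # (n, 200, 200, 3) array containing only the known class
# one_class_model.fit(X_train, np.ones((len(X_train), 1)), epochs=10)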

This method should also work for your case. The network would then be capable of determining which photos are numerically further away from the training photo class. In my experience, however, it is likely still a hard problem to solve, due to the variance contained in the pictures: different backgrounds, angles, and so on. To that end, the problem I am solving is much easier, as there is much more similarity between operating conditions of the same condition stage. To put it into an analogy, in my case the training class is more like the same picture with different noise levels and only slight movements of objects.

Treating your problem as a supervised problem:

You are solving a face recognition problem. Your problem is a binary classification problem if you want to distinguish between "Nicolas Cage" and any other random image. For binary classification you need a second class with label 0: the not-"Nicolas Cage" class.

A very famous example of this is the Hotdog-Not-Hotdog problem (Silicon Valley). These links might help you.

https://towardsdatascience.com/building-the-hotdog-not-hotdog-classifier-from-hbos-silicon-valley-c0cb2317711f

https://github.com/J-Yash/Hotdog-Not-Hotdog/blob/master/Hotdog_classifier_transfer_learning.ipynb
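
In the same spirit as the notebook linked above, a compact transfer-learning baseline for the binary setup might look like the following sketch; VGG16 as the frozen base and the head sizes are assumptions, not taken from that notebook.

from keras.applications.vgg16 import VGG16
from keras.models import Model
from keras.layers import Dense, GlobalAveragePooling2D

# Pretrained convolutional base; freeze it so only the new head trains.
base = VGG16(weights='imagenet', include_top=False, input_shape=(200, 200, 3))
for layer in base.layers:
    layer.trainable = False

x = GlobalAveragePooling2D()(base.output)
x = Dense(64, activation='relu')(x)
out = Dense(1, activation='sigmoid')(x)  # Cage vs. not-Cage

model = Model(inputs=base.input, outputs=out)
model.compile(optimizer='sgd', loss='binary_crossentropy',
              metrics=['accuracy'])

# Reuses the generators from the question:
# model.fit_generator(training_set, epochs=10, validation_data=test_set)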

Treating your problem as an unsupervised problem:

Here you can represent each image as an embedding vector. Pass your Nicolas Cage images into a pre-trained FaceNet, which will give you a face embedding for each one, and plot those embeddings to see the relationship between the images.

https://paperswithcode.com/paper/facenet-a-unified-embedding-for-face
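
A rough sketch of that embedding idea, assuming a community Keras export of FaceNet saved as facenet_keras.h5 with 160x160 inputs (both the file name and the input size are assumptions) and face images that are already cropped:

import numpy as np
from keras.models import load_model
from keras.preprocessing import image

facenet = load_model('facenet_keras.h5')  # assumed pretrained export

def embed(path):
    img = image.load_img(path, target_size=(160, 160))
    x = image.img_to_array(img)
    x = (x - 127.5) / 128.0  # common FaceNet-style standardization
    return facenet.predict(np.expand_dims(x, axis=0))[0]

a = embed('cage_1.jpg')  # illustrative file names
b = embed('cage_2.jpg')

# Cosine similarity: close to 1 when two embeddings belong to the same face.
print(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))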
