
OneClass SVM model using pretrained ResNet50 network

I'm trying to build a one-class classifier for image recognition. I found this article, but because I don't have the full source code, I don't fully understand what I'm doing.

X_train, X_test, y_train, y_test = train_test_split(x, y, random_state=42)

# X_train (2250, 200, 200, 3)
resnet_model = ResNet50(input_shape=(200, 200, 3), weights='imagenet', include_top=False)
features_array = resnet_model.predict(X_train)
# features_array (2250, 7, 7, 2048)
pca = PCA(svd_solver='randomized', n_components=450, whiten=True, random_state=42)
svc = SVC(kernel='rbf', class_weight='balanced')
model = make_pipeline(pca, svc)

param_grid = {'svc__C': [1, 5, 10, 50], 'svc__gamma': [0.0001, 0.0005, 0.001, 0.005]}
grid = GridSearchCV(model, param_grid)
grid.fit(X_train, y_train)

I have 2250 images (food and not food), each 200x200 px, and I pass this data to the predict method of the ResNet50 model. The result is a tensor of shape (2250, 7, 7, 2048). Does anyone know what these dimensions mean?

When I try to run the grid.fit method, I get an error:

ValueError: Found array with dim 4. Estimator expected <= 2.

Here is what I could find.

With include_top=False, you are getting the output of the last convolutional block, i.e. the tensor that sits before the global average pooling layer. Its shape reads as (samples, height, width, channels): 2250 images, a 7x7 spatial grid, and 2048 feature channels. (Call resnet_model.summary() to see how the input dimensions change into the output dimensions.)
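To see where the 7x7 grid comes from, here is a rough sketch that traces one spatial axis through the downsampling steps. The kernel, stride, and padding values are my reading of the Keras ResNet50 layout (an assumption, not taken from the question), and resnet50_feature_map_size is a hypothetical helper name; resnet_model.summary() remains the authoritative check.

```python
def conv_out(size, kernel, stride, padding=0):
    """Output length of a conv/pooling window along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1


def resnet50_feature_map_size(size):
    """Trace one spatial axis through the downsampling layers (assumed layout)."""
    size = conv_out(size, kernel=7, stride=2, padding=3)  # stem 7x7 convolution
    size = conv_out(size, kernel=3, stride=2, padding=1)  # 3x3 max pooling
    for _ in range(3):
        # Each of the last three stages starts with a stride-2 1x1 convolution.
        size = conv_out(size, kernel=1, stride=2)
    return size


print(resnet50_feature_map_size(200))  # 7
print(resnet50_feature_map_size(224))  # 7
```

So a 200x200 input shrinks to 7x7, and the 2048 is just the number of channels produced by the last block.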

For a simple fix, add an AveragePooling2D layer on top of resnet_model, so that the output shape becomes (2250, 1, 1, 2048).

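from tensorflow.keras.applications import ResNet50  # import paths assume TensorFlow's bundled Keras
from tensorflow.keras.layers import AveragePooling2D
from tensorflow.keras.models import Model
resnet_model = ResNet50(input_shape=(200, 200, 3), weights='imagenet', include_top=False)
resnet_op = AveragePooling2D((7, 7), name='avg_pool_app')(resnet_model.output)
resnet_model = Model(resnet_model.input, resnet_op, name="ResNet")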

A pooling layer like this is generally present in the source code of ResNet50 itself. Basically, we are appending an AveragePooling2D layer to the ResNet50 model. The last line combines the new layer (second line) and the base model into a single Model object.

Now the output of resnet_model.predict(X_train) (features_array) will have shape (2250, 1, 1, 2048) because of the added average pooling layer.

To avoid the ValueError, reshape features_array to (2250, 2048):

features_array = np.reshape(features_array, (-1, 2048))
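If it helps to see what the pooling and reshape do to the numbers, here is a small NumPy-only sketch. It uses a random array as a stand-in for the real ResNet50 output (4 fake images instead of 2250, purely for illustration), and the names pooled and flat are mine.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for resnet_model.predict(X_train): 4 fake images instead of 2250.
features_array = rng.random((4, 7, 7, 2048), dtype=np.float32)

# Averaging over the 7x7 spatial grid is what a (7, 7) average pool computes here.
pooled = features_array.mean(axis=(1, 2), keepdims=True)

# Drop the two singleton spatial axes so each image becomes one 2048-long row.
flat = np.reshape(pooled, (-1, 2048))

print(pooled.shape)  # (4, 1, 1, 2048)
print(flat.shape)    # (4, 2048)
```

The 2-D result is the shape scikit-learn estimators expect: one row per sample, one column per feature.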

In the last line of the program in the question,

grid.fit(X_train, y_train)

you fit on X_train (the raw images in this case). The correct variable here is features_array, which serves as a compact summary of each image. Using this line instead will fix the error:

grid.fit(features_array, y_train)
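As a sanity check of the PCA + SVC grid search on 2-D input, here is a minimal sketch that runs the same kind of pipeline on synthetic data. The sample count, feature width, n_components, and the reduced parameter grid are all made-up stand-ins chosen so it runs quickly; they are not the values from the question.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

rng = np.random.default_rng(42)
n_samples, n_features = 80, 64  # hypothetical stand-ins for 2250 and 2048
labels = rng.integers(0, 2, size=n_samples)
# Synthetic 2-D "features": class 1 is shifted so the two classes are separable.
features = rng.normal(size=(n_samples, n_features)) + labels[:, None] * 2.0

pca = PCA(svd_solver='randomized', n_components=10, whiten=True, random_state=42)
svc = SVC(kernel='rbf', class_weight='balanced')
model = make_pipeline(pca, svc)

param_grid = {'svc__C': [1, 10], 'svc__gamma': [0.001, 0.01]}
grid = GridSearchCV(model, param_grid)
grid.fit(features, labels)  # 2-D input of shape (n_samples, n_features)

print(grid.best_params_)
```

With the real data you would pass the reshaped features_array and y_train in place of features and labels.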

For further fine-tuning along these lines using extracted feature vectors, look here (training with neural nets instead of using PCA and SVM).

Hope this helps!

