
OneClass SVM model using pretrained ResNet50 network

I'm trying to build a one-class classifier for image recognition. I found this article, but because I don't have the full source code I don't fully understand what I am doing.

from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.decomposition import PCA
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from tensorflow.keras.applications.resnet50 import ResNet50

X_train, X_test, y_train, y_test = train_test_split(x, y, random_state=42)

# X_train: (2250, 200, 200, 3)
resnet_model = ResNet50(input_shape=(200, 200, 3), weights='imagenet', include_top=False)
features_array = resnet_model.predict(X_train)
# features_array: (2250, 7, 7, 2048)
pca = PCA(svd_solver='randomized', n_components=450, whiten=True, random_state=42)
svc = SVC(kernel='rbf', class_weight='balanced')
model = make_pipeline(pca, svc)

param_grid = {'svc__C': [1, 5, 10, 50], 'svc__gamma': [0.0001, 0.0005, 0.001, 0.005]}
grid = GridSearchCV(model, param_grid)
grid.fit(X_train, y_train)

I have 2250 images (food and not food), each 200x200 px, which I pass to the predict method of the ResNet50 model. The result is a (2250, 7, 7, 2048) tensor; does anyone know what this dimensionality means?

When I try to run the grid.fit method I get an error:

ValueError: Found array with dim 4. Estimator expected <= 2.

Here are the findings I could make.

You are getting the output tensor from just before the global average pooling layer. (Run resnet_model.summary() to see how the input dimensions change through the network.)
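As a rough sanity check (a sketch of the downsampling, not ResNet50's exact padding arithmetic), the 7x7 spatial grid comes from five stride-2 downsampling steps applied to the 200x200 input, a total downsampling factor of 32; the 2048 is the channel count of the final stage:

```python
import math

# ResNet50 halves the spatial resolution five times (initial stride-2
# conv, stride-2 max pool, and one stride-2 stage transition in each of
# the last three stages). Each halving rounds up here as a rough model
# of the padding behaviour.
size = 200
for _ in range(5):
    size = math.ceil(size / 2)
print(size)  # 7 -> the 7x7 spatial grid of the (2250, 7, 7, 2048) tensor
```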

As a simple fix, add an AveragePooling2D layer on top of resnet_model, so that the output shape becomes (2250, 1, 1, 2048):

from tensorflow.keras.applications.resnet50 import ResNet50
from tensorflow.keras.layers import AveragePooling2D
from tensorflow.keras.models import Model

resnet_model = ResNet50(input_shape=(200, 200, 3), weights='imagenet', include_top=False)
resnet_op = AveragePooling2D((7, 7), name='avg_pool_app')(resnet_model.output)
resnet_model = Model(resnet_model.input, resnet_op, name='ResNet')

This pooling layer is present in the source code of ResNet50 itself. Basically, we append an AveragePooling2D layer to the ResNet50 model; the last line combines the new layer (second line) and the base model into a single Model object.

Now the output (features_array) will have shape (2250, 1, 1, 2048), because of the added average-pooling layer.
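As an aside, Keras also lets you request the pooled output directly via the pooling='avg' argument of ResNet50, which appends a global average pooling layer for you and returns already-flattened features. A minimal sketch (weights=None here only to avoid downloading the ImageNet weights in a demo; use weights='imagenet' for real feature extraction):

```python
from tensorflow.keras.applications.resnet50 import ResNet50

# pooling='avg' adds a GlobalAveragePooling2D layer, so the features
# come out flattened to (n_samples, 2048) with no manual reshape.
model = ResNet50(input_shape=(200, 200, 3), weights=None,
                 include_top=False, pooling='avg')
print(model.output_shape)  # (None, 2048)
```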

To avoid the ValueError, reshape this features_array to (2250, 2048):

import numpy as np

features_array = np.reshape(features_array, (-1, 2048))

In the last line of the program in the question,

grid.fit(X_train, y_train)

you fit with X_train, which holds the raw images. The correct variable here is features_array, which is the summary of each image. Changing the line as follows fixes the error:

grid.fit(features_array, y_train)

For more fine-tuning in this fashion by extracting feature vectors, do look here (training with neural networks instead of using PCA and SVM).
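One more note: the question's title asks for a one-class SVM, but the pipeline above trains a two-class SVC on food/not-food labels. If you truly want one-class training (fit on food features only, flag everything else as an outlier), scikit-learn's OneClassSVM drops in for SVC. A hedged sketch with random stand-in features; the n_components=50 and nu=0.1 values are illustrative, not tuned:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import OneClassSVM

# Stand-in for the (n_samples, 2048) ResNet feature array -- random here.
rng = np.random.RandomState(42)
features_array = rng.rand(100, 2048)

oc_model = make_pipeline(
    PCA(n_components=50, whiten=True, random_state=42),
    OneClassSVM(kernel='rbf', gamma='scale', nu=0.1),
)
oc_model.fit(features_array)              # trained on "food" features only
preds = oc_model.predict(features_array)  # +1 = inlier, -1 = outlier
```

Note that OneClassSVM takes no labels, so GridSearchCV with accuracy scoring no longer applies; nu (the expected outlier fraction) is tuned by hand or with a held-out validation set.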

Hope this helps!!
