How to do transfer learning of a ResNet50 model with my own dataset?

I am trying to build a face verification system using Keras and a ResNet50 model with VGGFace weights. I am trying to achieve this with the following steps:

  • Given two images, I first extract the face from each using MTCNN and turn each face into an embedding.
  • Then I calculate the cosine distance between the two embedding vectors. The distance ranges from 0 to 1 (note that the lower the distance, the more likely the two faces belong to the same person).
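For reference, the distance step can be sketched with NumPy alone. This is a minimal illustration, not the code from the post; the function name `cosine_distance` is made up here. In general cosine distance lies between 0 and 2, and it stays within 0 to 1 when the embeddings have no negative components, which is typical of features taken after a ReLU and average pooling.

```python
import numpy as np

def cosine_distance(a, b):
    """Cosine distance 1 - cos(a, b) between two embedding vectors."""
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - float(np.dot(a, b) / denom)
```

A verification decision would then compare this value against a threshold chosen on held-out pairs.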

Using the pre-trained ResNet50 model, I get fairly good results. But since the model was trained mostly on European data and I want face verification for the Indian subcontinent, I cannot rely on it. I want to train it on my own dataset. I have 10000 classes, each containing 2 images. With image augmentation I can create 10-15 images per class from those two images.
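As a rough sketch of what that augmentation could look like, the snippet below expands a class from 2 images to 12 using only NumPy. The function names, the flip probability, the brightness range, and the 10% shift are all assumptions for illustration; the post does not say which transforms are used.

```python
import numpy as np

def augment(image, rng):
    """Return one randomly perturbed copy of an HxWx3 uint8 image."""
    out = image.astype(np.float32)
    if rng.random() < 0.5:
        out = out[:, ::-1, :]              # horizontal flip
    out = out * rng.uniform(0.8, 1.2)      # brightness jitter (assumed range)
    h, w = out.shape[:2]
    max_dy, max_dx = h // 10, w // 10      # shift by up to 10% (assumed)
    dy = int(rng.integers(-max_dy, max_dy + 1))
    dx = int(rng.integers(-max_dx, max_dx + 1))
    # Pad with edge pixels, then crop a shifted window of the original size.
    padded = np.pad(out, ((max_dy, max_dy), (max_dx, max_dx), (0, 0)), mode="edge")
    top, left = max_dy + dy, max_dx + dx
    out = padded[top:top + h, left:left + w, :]
    return np.clip(out, 0, 255).astype(np.uint8)

def expand_class(images, copies_per_image, seed=0):
    """Keep the originals and add `copies_per_image` augmented copies of each."""
    rng = np.random.default_rng(seed)
    result = list(images)
    for image in images:
        result.extend(augment(image, rng) for _ in range(copies_per_image))
    return result

# Two placeholder 224x224 RGB images stand in for one identity's photos.
originals = [np.full((224, 224, 3), 128, dtype=np.uint8) for _ in range(2)]
expanded = expand_class(originals, copies_per_image=5)
```

With `copies_per_image=5`, two source images become 12 images for the class, which falls in the 10-15 range mentioned above.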

Here is the sample code I am using for training:

from keras.layers import Dense, GlobalAveragePooling2D
from keras.models import Model
from keras.preprocessing.image import ImageDataGenerator
from keras_vggface.vggface import VGGFace
# Assumption: the original post does not show where preprocess_input comes from;
# the one shipped with keras_vggface is used here.
from keras_vggface.utils import preprocess_input

# ResNet50 with VGGFace weights; include_top=False leaves out the classifier head.
base_model = VGGFace(model='resnet50', include_top=False, input_shape=(224, 224, 3))
# Note: this only edits the layer list; base_model.output is unchanged by it.
base_model.layers.pop()
base_model.summary()

# Freeze the pre-trained layers so that only the new head is trained.
for layer in base_model.layers:
    layer.trainable = False

x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(1024, activation='relu')(x)  # dense layers so the model can learn more complex functions
x = Dense(1024, activation='relu')(x)  # dense layer 2
x = Dense(512, activation='relu')(x)   # dense layer 3
preds = Dense(8322, activation='softmax')(x)  # final layer with softmax activation

model = Model(inputs=base_model.input, outputs=preds)
model.compile(optimizer='Adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()

train_datagen = ImageDataGenerator(preprocessing_function=preprocess_input)

train_generator = train_datagen.flow_from_directory(
    '/Users/imac/Desktop/Fayed/Facematching/testenv/facenet/Dataset/train',  # path to the main data folder
    target_size=(224, 224),
    color_mode='rgb',
    batch_size=32,
    class_mode='categorical',
    shuffle=True)

# steps_per_epoch should be an integer, so use floor division.
step_size_train = train_generator.n // train_generator.batch_size

model.fit_generator(generator=train_generator,
                    steps_per_epoch=step_size_train,
                    epochs=10)
model.save('directory')

As far as the code is concerned, my understanding is that I disable the last layer, then add 4 layers, train them, and store the model in a directory.

I then load the model using:

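from keras.models import load_model

model = load_model('directory of my saved model')
model.summary()
yhat = model.predict(samples)  # samples: batch of preprocessed face images (not defined in the post)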

I predict the embeddings of the two images and then calculate the cosine distance. But the problem is that the predictions get worse with my trained model. For two images of the same person, the pre-trained model gives a distance of 0.3, whereas my trained model shows a distance of 1.0. During training the loss decreases with each epoch and the accuracy increases, but that is not reflected in my prediction output. I want to improve on the prediction results of the pre-trained model.
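One thing worth checking: the model in the training code ends in `Dense(8322, activation='softmax')`, so `model.predict` returns class probabilities rather than a feature embedding. The sketch below uses made-up logits, not real model output, to show how cosine distance behaves on such vectors: it comes out close to 1 whenever two images peak on different classes and close to 0 when they peak on the same class, with little in between.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    shifted = logits - np.max(logits)
    exps = np.exp(shifted)
    return exps / exps.sum()

def cosine_distance(a, b):
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

num_classes = 8322  # size of the softmax layer in the training code
rng = np.random.default_rng(0)

# Hypothetical logits for three images; a large value marks the predicted class.
logits_a = rng.normal(size=num_classes)
logits_b = rng.normal(size=num_classes)
logits_c = rng.normal(size=num_classes)
logits_a[10] += 20.0  # image A is assigned to class 10
logits_b[11] += 20.0  # image B is assigned to class 11
logits_c[10] += 20.0  # image C is assigned to class 10 as well

different_class_distance = cosine_distance(softmax(logits_a), softmax(logits_b))
same_class_distance = cosine_distance(softmax(logits_a), softmax(logits_c))
```

If this is what is happening, one option would be to take the embedding from a layer before the softmax instead of from the final output. That is an assumption about the cause, not something confirmed in this thread.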

How can I achieve that with my own data?

NB: I am relatively new to machine learning and don't know a lot about model layers.

What I would suggest is to go with a triplet or siamese network, given this many classes. Use MTCNN to extract faces and then use the FaceNet architecture to generate 512-dimensional embedding vectors, then visualize them using a t-SNE plot. Each face will be assigned to a small embedding cluster. Go through this link for generating face embeddings with Keras: Link.

Then, try triplet semi-hard and hard loss on your dataset to cluster the embeddings into 10000 classes. It might help. Go through this detailed blog on triplet loss: Triplets. Code to go through in some of the repositories: Code.
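To make the triplet idea concrete, here is a small NumPy sketch of the loss and of the usual semi-hard condition on squared Euclidean distances. The function names and the margin of 0.2 are assumptions for illustration; a real training setup would use the loss implementation from its deep learning framework.

```python
import numpy as np

def squared_distance(a, b):
    return np.sum((np.asarray(a) - np.asarray(b)) ** 2, axis=-1)

def triplet_loss(anchor, positive, negative, margin=0.2):
    """max(d(a, p) - d(a, n) + margin, 0): zero once the negative is far enough."""
    d_pos = squared_distance(anchor, positive)
    d_neg = squared_distance(anchor, negative)
    return np.maximum(d_pos - d_neg + margin, 0.0)

def is_semi_hard(anchor, positive, negative, margin=0.2):
    """Negative is farther than the positive but still inside the margin."""
    d_pos = squared_distance(anchor, positive)
    d_neg = squared_distance(anchor, negative)
    return (d_pos < d_neg) & (d_neg < d_pos + margin)

anchor = np.array([0.0, 0.0])
positive = np.array([0.1, 0.0])       # same identity, close to the anchor
semi_hard_neg = np.array([0.3, 0.0])  # other identity, within the margin
easy_neg = np.array([1.0, 0.0])       # other identity, already far away
hard_neg = np.array([0.05, 0.0])      # other identity, closer than the positive
```

Easy negatives contribute zero loss, so training tends to focus on semi-hard and hard ones, which is why the mining strategy matters when each class has only a few images.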
