
TensorFlow: training on my own image

I am new to TensorFlow. I am looking for help with image recognition where I can train my own image dataset.

Is there any example for training on a new dataset?

If you are interested in how to input your own data in TensorFlow, you can look at this tutorial.
I've also written a guide with best practices for CS230 at Stanford here.


New answer (with tf.data) and with labels

With the introduction of tf.data in r1.4, we can create a batch of images without placeholders and without queues. The steps are the following:

  1. Create a list containing the filenames of the images and a corresponding list of labels
  2. Create a tf.data.Dataset reading these filenames and labels
  3. Preprocess the data
  4. Create an iterator from the tf.data.Dataset which will yield the next batch

The code is:

# step 1
filenames = tf.constant(['im_01.jpg', 'im_02.jpg', 'im_03.jpg', 'im_04.jpg'])
labels = tf.constant([0, 1, 0, 1])

# step 2: create a dataset returning slices of `filenames`
dataset = tf.data.Dataset.from_tensor_slices((filenames, labels))

# step 3: parse every image in the dataset using `map`
def _parse_function(filename, label):
    image_string = tf.read_file(filename)
    image_decoded = tf.image.decode_jpeg(image_string, channels=3)
    image = tf.cast(image_decoded, tf.float32)
    return image, label

dataset = dataset.map(_parse_function)
dataset = dataset.batch(2)

# step 4: create iterator and final input tensor
iterator = dataset.make_one_shot_iterator()
images, labels = iterator.get_next()

Now we can directly run sess.run([images, labels]) without feeding any data through placeholders.
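
As a rough sketch of what that looks like in practice (TensorFlow 1.x, assuming the images and labels tensors from step 4 above and that all images share the same size; otherwise add a resize in _parse_function before batching):

import tensorflow as tf

with tf.Session() as sess:
    # The one-shot iterator needs no initialization; just fetch batches until the data runs out.
    while True:
        try:
            image_batch, label_batch = sess.run([images, labels])
            print(image_batch.shape, label_batch)
        except tf.errors.OutOfRangeError:
            break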


Old answer (with TensorFlow queues)

To sum it up, you have multiple steps:

  1. Create a list of filenames (ex: the paths to your images)
  2. Create a TensorFlow filename queue
  3. Read and decode each image, resize them to a fixed size (necessary for batching)
  4. Output a batch of these images

The simplest code would be:

# step 1
filenames = ['im_01.jpg', 'im_02.jpg', 'im_03.jpg', 'im_04.jpg']

# step 2
filename_queue = tf.train.string_input_producer(filenames)

# step 3: read, decode and resize images
reader = tf.WholeFileReader()
filename, content = reader.read(filename_queue)
image = tf.image.decode_jpeg(content, channels=3)
image = tf.cast(image, tf.float32)
resized_image = tf.image.resize_images(image, [224, 224])

# step 4: Batching
image_batch = tf.train.batch([resized_image], batch_size=8)
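
To actually pull batches out of this pipeline you need a session with the queue runners started. A minimal sketch (TensorFlow 1.x), assuming the image_batch tensor defined above:

import tensorflow as tf

with tf.Session() as sess:
    # Start the threads that fill the filename queue and the batching queue.
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)

    batch = sess.run(image_batch)  # numpy array of shape (8, 224, 224, 3)
    print(batch.shape)

    coord.request_stop()
    coord.join(threads)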

Based on @olivier-moindrot's answer, but for TensorFlow 2.0+:

# step 1
filenames = tf.constant(['im_01.jpg', 'im_02.jpg', 'im_03.jpg', 'im_04.jpg'])
labels = tf.constant([0, 1, 0, 1])

# step 2: create a dataset returning slices of `filenames`
dataset = tf.data.Dataset.from_tensor_slices((filenames, labels))

def im_file_to_tensor(file, label):
    def _im_file_to_tensor(file, label):
        path = f"../foo/bar/{file.numpy().decode()}"
        im = tf.image.decode_jpeg(tf.io.read_file(path), channels=3)
        im = tf.cast(im, tf.float32) / 255.0
        return im, label
    return tf.py_function(_im_file_to_tensor, 
                          inp=(file, label), 
                          Tout=(tf.float32, tf.uint8))

dataset = dataset.map(im_file_to_tensor)
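
In TensorFlow 2.x the dataset runs eagerly, so a quick sanity check of the pipeline could look like this (a sketch, assuming the JPEG files actually exist under ../foo/bar/):

for im, label in dataset.take(2):
    print(im.shape, label.numpy())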

If you are hitting an issue similar to:

ValueError: Cannot take the length of Shape with unknown rank

when passing tf.data.Dataset tensors to model.fit, then take a look at https://github.com/tensorflow/tensorflow/issues/24520. A fix for the code snippet above would be:

def im_file_to_tensor(file, label):
    def _im_file_to_tensor(file, label):
        path = f"../foo/bar/{file.numpy().decode()}"
        im = tf.image.decode_jpeg(tf.io.read_file(path), channels=3)
        im = tf.cast(im, tf.float32) / 255.0
        return im, label

    file, label = tf.py_function(_im_file_to_tensor, 
                                 inp=(file, label), 
                                 Tout=(tf.float32, tf.uint8))
    file.set_shape([192, 192, 3])
    label.set_shape([])
    return (file, label)
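
With the shapes set, the dataset can be batched and handed straight to Keras. The following is only a hypothetical end-to-end sketch (the tiny model, the batch size, and the two-class assumption are placeholders, not part of the original answer):

import tensorflow as tf

# Rebuild the pipeline with the fixed map function, then batch it.
dataset = tf.data.Dataset.from_tensor_slices((filenames, labels))
dataset = dataset.map(im_file_to_tensor).batch(32)

# Placeholder model just to show that model.fit accepts the dataset directly.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(192, 192, 3)),
    tf.keras.layers.Dense(2, activation='softmax')
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(dataset, epochs=5)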

2.0-compatible answer using TensorFlow Hub: TensorFlow Hub is a library offered by TensorFlow that comprises models developed by Google for text and image datasets.

It saves thousands of hours of training time and computational effort, as it reuses existing pre-trained models.

If we have an image dataset, we can take an existing pre-trained model from TF Hub and adapt it to our dataset.

Code for retraining on our image dataset using the pre-trained model MobileNet is shown below:

import itertools
import os

import matplotlib.pylab as plt
import numpy as np

import tensorflow as tf
import tensorflow_hub as hub

module_selection = ("mobilenet_v2_100_224", 224) #@param ["(\"mobilenet_v2_100_224\", 224)", "(\"inception_v3\", 299)"] {type:"raw", allow-input: true}
handle_base, pixels = module_selection
MODULE_HANDLE ="https://tfhub.dev/google/imagenet/{}/feature_vector/4".format(handle_base)
IMAGE_SIZE = (pixels, pixels)
print("Using {} with input size {}".format(MODULE_HANDLE, IMAGE_SIZE))

BATCH_SIZE = 32 #@param {type:"integer"}

# Here we need to pass our own dataset

data_dir = tf.keras.utils.get_file(
    'flower_photos',
    'https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz',
    untar=True)

do_fine_tuning = False  #@param {type:"boolean"}

# `train_generator` is the training data generator built from `data_dir`
# (see the sketch after this snippet); `num_classes` is taken from it.
model = tf.keras.Sequential([
    hub.KerasLayer(MODULE_HANDLE, trainable=do_fine_tuning),
    tf.keras.layers.Dropout(rate=0.2),
    tf.keras.layers.Dense(train_generator.num_classes, activation='softmax',
                          kernel_regularizer=tf.keras.regularizers.l2(0.0001))
])
model.build((None,)+IMAGE_SIZE+(3,))
model.summary()
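
The snippet above references train_generator, which the full tutorial builds from data_dir before defining the model. A rough sketch of that part and of the training step (simplified from the linked tutorial, so treat the exact arguments as assumptions):

datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1./255, validation_split=0.20)
train_generator = datagen.flow_from_directory(
    data_dir, subset="training", shuffle=True,
    target_size=IMAGE_SIZE, batch_size=BATCH_SIZE)

model.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=0.005, momentum=0.9),
    loss=tf.keras.losses.CategoricalCrossentropy(label_smoothing=0.1),
    metrics=['accuracy'])

steps_per_epoch = train_generator.samples // train_generator.batch_size
model.fit(train_generator, epochs=5, steps_per_epoch=steps_per_epoch)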

The complete code for the image retraining tutorial can be found in this GitHub link.

More information about TensorFlow Hub can be found in this TF blog.

The pre-trained modules related to images can be found in this TF Hub link.

All the pre-trained modules related to images, text, videos, etc. can be found in this TF Hub modules link.

Finally, this is the basic page for TensorFlow Hub.

If your dataset is organized into subfolders, you can use ImageDataGenerator; its flow_from_directory method helps to load data from a directory:

from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_batches = ImageDataGenerator().flow_from_directory(
    directory=train_path, target_size=(img_height, img_width),
    batch_size=32, color_mode="grayscale")

The structure of the folder hierarchy can be as follows:

train
    -- cat
    -- dog
    -- monkey
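
For completeness, a hypothetical sketch of feeding these batches to a small Keras model (the single dense layer and the three classes below are placeholders, not part of the original answer; img_height and img_width are the same values used above):

import tensorflow as tf

# flow_from_directory with the default class_mode yields one-hot labels,
# with classes inferred from the subfolder names.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(img_height, img_width, 1)),  # grayscale -> 1 channel
    tf.keras.layers.Dense(3, activation='softmax')  # cat / dog / monkey
])
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(train_batches, epochs=5)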
