TensorFlow CNN image augmentation pipeline
I'm trying to learn the new TensorFlow APIs, and I'm a bit lost on where to get a handle on my input batch tensors so that I can manipulate and augment them with, for example, tf.image.
This is my current network and pipeline:
trainX, testX, trainY, testY = read_data()
# trainX [num_image, height, width, channels], these are numpy arrays
#...
train_dataset = tf.data.Dataset.from_tensor_slices((trainX, trainY))
test_dataset = tf.data.Dataset.from_tensor_slices((testX, testY))
#...
iterator = tf.data.Iterator.from_structure(train_dataset.output_types,
                                           train_dataset.output_shapes)
features, labels = iterator.get_next()
train_init_op = iterator.make_initializer(train_dataset)
test_init_op = iterator.make_initializer(test_dataset)
#...defining cnn architecture...
# In the train loop
for epoch in range(num_epochs):  # num_epochs: placeholder for the number of epochs
    sess.run(train_init_op)  # switch to train data
    sess.run(train_step, ...)  # run a train step
    #...
    sess.run(test_init_op)  # switch to test data
    test_loss = sess.run(loss, ...)  # compute the test loss after the epoch
I'm using the Dataset API to create two datasets so that, in the train loop, I can calculate both the train and test loss and log them.
Where in this pipeline would I manipulate and distort my input batch of images? I'm not creating any tf.placeholders for my trainX input batches, so I can't manipulate them with tf.image because, for example, tf.image.flip_up_down requires a 3-D or 4-D tensor.
A really good article and talk were released recently that go over the API in a lot more detail than my response here. Here's a brief example:
import tensorflow as tf
import numpy as np
def read_data():
    n_train = 100
    n_test = 50
    height = 20
    width = 30
    channels = 3
    trainX = (np.random.random(
        size=(n_train, height, width, channels)) * 255).astype(np.uint8)
    testX = (np.random.random(
        size=(n_test, height, width, channels)) * 255).astype(np.uint8)
    trainY = (np.random.random(size=(n_train,)) * 10).astype(np.int32)
    testY = (np.random.random(size=(n_test,)) * 10).astype(np.int32)
    return trainX, testX, trainY, testY
trainX, testX, trainY, testY = read_data()
# trainX [num_image, height, width, channels], these are numpy arrays
train_dataset = tf.data.Dataset.from_tensor_slices((trainX, trainY))
test_dataset = tf.data.Dataset.from_tensor_slices((testX, testY))
def map_single(x, y):
    print('Map single:')
    print('x shape: %s' % str(x.shape))
    print('y shape: %s' % str(y.shape))
    x = tf.image.per_image_standardization(x)
    # Consider: x = tf.image.random_flip_left_right(x)
    return x, y
def map_batch(x, y):
    print('Map batch:')
    print('x shape: %s' % str(x.shape))
    print('y shape: %s' % str(y.shape))
    # Note: this flips ALL images left to right. Not sure this is what you want
    # UPDATE: looks like tf documentation is wrong and you need a 3D tensor?
    # return tf.image.flip_left_right(x), y
    return x, y
batch_size = 32
train_dataset = train_dataset.repeat().shuffle(100)
train_dataset = train_dataset.map(map_single, num_parallel_calls=8)
train_dataset = train_dataset.batch(batch_size)
train_dataset = train_dataset.map(map_batch)
train_dataset = train_dataset.prefetch(2)
test_dataset = test_dataset.map(
    map_single, num_parallel_calls=8).batch(batch_size).map(map_batch)
test_dataset = test_dataset.prefetch(2)
iterator = tf.data.Iterator.from_structure(train_dataset.output_types,
                                           train_dataset.output_shapes)
features, labels = iterator.get_next()
train_init_op = iterator.make_initializer(train_dataset)
test_init_op = iterator.make_initializer(test_dataset)
with tf.Session() as sess:
    sess.run(train_init_op)
    feat, lab = sess.run((features, labels))
    print(feat.shape)
    print(lab.shape)
    sess.run(test_init_op)
    feat, lab = sess.run((features, labels))
    print(feat.shape)
    print(lab.shape)
A few notes:
If your dataset is too large to load into memory this way, consider tf.data.Dataset.from_generator. This can lead to slow shuffle times if your shuffle buffer is large. Another option is to load some keys tensor entirely into memory (it might just be the indices of each example), then map each key to its data values using tf.py_func. This is slightly less efficient than converting to tfrecords, but with prefetching it likely won't affect performance. Since the shuffling is done before the mapping, you only have to load shuffle_buffer keys into memory, rather than shuffle_buffer examples.
To augment your data, use tf.data.Dataset.map either before or after the batch operation, depending on whether you want to apply a batch-wise operation (something working on a 4D image tensor) or an element-wise operation (3D image tensor). Note that the documentation for tf.image.flip_left_right looks out of date, since I get an error when I try to use a 4D tensor. If you want to augment your data randomly, use tf.image.random_flip_left_right rather than tf.image.flip_left_right.
If you're using tf.estimator.Estimator (or wouldn't mind converting your code to use it), then check out tf.estimator.train_and_evaluate for a built-in way of switching between datasets.
Consider shuffling and repeating your dataset with the shuffle / repeat methods. See the article for notes on efficiencies.
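The key-based loading and element-wise augmentation described in the notes above can be sketched roughly as follows. This is only an illustration under a few assumptions: it targets TensorFlow 2.x with eager execution (unlike the session-based code above), it uses tf.numpy_function where the notes mention tf.py_func, and load_example together with the in-memory _IMAGES / _LABELS arrays are hypothetical stand-ins for whatever reads a single example from disk.

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in for data that would normally live on disk.
_IMAGES = (np.random.random(size=(100, 20, 30, 3)) * 255).astype(np.uint8)
_LABELS = (np.arange(100) % 10).astype(np.int32)


def load_example(key):
    """Hypothetical loader: maps an integer key to (image, label) arrays."""
    return _IMAGES[key], _LABELS[key]


def load_and_augment(key):
    # Wrap the Python loader so it can run inside the tf.data pipeline.
    image, label = tf.numpy_function(load_example, [key], [tf.uint8, tf.int32])
    # Static shapes are lost through numpy_function, so restore them.
    image.set_shape((20, 30, 3))
    label.set_shape(())
    # Element-wise (3D) augmentation: each image gets its own random flip.
    image = tf.image.random_flip_left_right(image)
    image = tf.cast(image, tf.float32)
    image = tf.image.per_image_standardization(image)
    return image, label


def make_train_dataset(n_examples=100, batch_size=32, shuffle_buffer=100):
    # Only the keys are shuffled, so the buffer holds integers, not images.
    keys = tf.data.Dataset.range(n_examples)
    dataset = keys.repeat().shuffle(shuffle_buffer)
    dataset = dataset.map(load_and_augment, num_parallel_calls=4)
    dataset = dataset.batch(batch_size)
    return dataset.prefetch(2)


features, labels = next(iter(make_train_dataset()))
print(features.shape, labels.shape)
```

Because the random flip sits in the per-element map (before batch), each image is flipped independently; a batch-wise map placed after batch would instead see a 4D tensor.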