
Shape of weight variable in TensorFlow

I am trying to learn how to work with tf.data.TFRecordDataset(), but I am confused about it. I have a tfrecords file that contains my images (24K) and labels, and I have resized all my images to 100x100x3.

First, I load my tfrecords file with tf.data.TFRecordDataset and parse the data, as you can see in my code. Then I wrote a simple model to learn how to use a tfrecords file, but I am stuck on an error when I try to run it. I have searched the internet but couldn't find an answer.

Here is my code (Train.py):

import tensorflow as tf
import numpy as np
import os
import glob
NUM_EPOCHS = 10
batch_size = 128
def _parse_function(example_proto):
  features = {"train/image": tf.FixedLenFeature((), tf.string, default_value=""),
            "train/label": tf.FixedLenFeature((), tf.int64, default_value=0)}
  parsed_features = tf.parse_single_example(example_proto, features)
  image = tf.decode_raw(parsed_features['train/image'], tf.float32)
  label = tf.cast(parsed_features['train/label'], tf.int32)
  image = tf.reshape(image, [100, 100, 3])
  image = tf.reshape(image, [100*100*3])

  return image, label

filename = 'train_data1.tfrecords'
dataset = tf.data.TFRecordDataset(filename)
dataset = dataset.map(_parse_function)
#dataset = dataset.repeat(NUM_EPOCHS)
dataset = dataset.batch(batch_size=batch_size)

iterator = dataset.make_initializable_iterator()
image, label = iterator.get_next()


w = tf.get_variable(name='Weights',shape= [30000,3] , initializer=tf.random_normal_initializer(0, 0.01))
b = tf.get_variable(name='Biases', shape= [1, 3],initializer=tf.zeros_initializer())

logits = tf.matmul(image, w) + b

loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(logits=logits, labels=label, name='Entropy'), name='loss')

optimizer = tf.train.AdamOptimizer(0.001).minimize(loss)

preds = tf.nn.softmax(logits)
correct_preds = tf.equal(tf.argmax(preds, axis=1), tf.argmax(label, axis=1))
accuracy = tf.reduce_sum(tf.cast(correct_preds, tf.float32))



with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(2):
        sess.run(iterator.initializer)
        total_loss = 0
        n_batches = 0
        try:
            while True:
                _, l = sess.run([optimizer, loss])
                total_loss += l
                n_batches +=1
        except tf.errors.OutOfRangeError:
            pass
        print('Average loss epoch {0}: {1}'.format(i, total_loss/n_batches))

This is what image looks like:

<tf.Tensor 'IteratorGetNext:0' shape=(?, 30000) dtype=float32>

and this is label:

<tf.Tensor 'IteratorGetNext:1' shape=(?,) dtype=int32>

This time I get the following error:

logits and labels must be same size: logits_size=[128,3] labels_size=[1,128].

When I reshape label to [128,1] with label = tf.reshape(label,[128,1]) (I think I am doing something wrong here), I get this error instead:

Dimension size must be evenly divisible by 3 but is 128 for 'gradients/Entropy/Reshape_grad/Reshape' (op: 'Reshape') with input shapes: [128,1], [2] and with input tensors computed as partial shapes: input[1] = [?,3].

I am trying to classify images into 3 classes: 0 for bike, 1 for bus, and 2 for car.

This is the code I use to write my images and labels into the tfrecords file (tfrecordWriter.py):

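import glob
import sys
from random import shuffle

import cv2
import numpy as np
import tensorflow as tf

shuffle_data = True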
cat_dog_train_path = './Train/*.jpg'
addrs = glob.glob(cat_dog_train_path)
labels = [0 if 'bike' in addr else 1 if 'bus' in addr else 2 for addr in addrs]

if shuffle_data:
    c = list(zip(addrs, labels))
    shuffle(c)
    addrs, labels = zip(*c)


train_addrs = addrs[:]
train_labels = labels[:]
train_shape = []
def load_image(addr):
    img = cv2.imread(addr)
    img = cv2.resize(img, (100, 100), interpolation=cv2.INTER_AREA)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    img = img.astype(np.float32)
    return img


def _int64_feature(value):
  return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))
def _bytes_feature(value):
  return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))


train_filename = 'train_data1.tfrecords'
# open the TFRecords file
writer = tf.python_io.TFRecordWriter(train_filename)
for i in range(len(train_addrs)):
    print ('Train data: {}/{}'.format(i+1, len(train_addrs)))
    sys.stdout.flush()
    img = load_image(train_addrs[i])
    label = train_labels[i]
    feature = {'train/label': _int64_feature(label),
               'train/image': _bytes_feature(tf.compat.as_bytes(img.tobytes()))}
    example = tf.train.Example(features=tf.train.Features(feature=feature))
    writer.write(example.SerializeToString())

writer.close()
sys.stdout.flush()

Thanks.

The problem arises from this line:

    w = tf.get_variable(name='Weights',shape= [None, 100, 100, 3] , initializer=tf.random_normal_initializer(0, 0.01))

You specified shape=[None,100,100,3] for your weights, which TensorFlow can't handle. As the error says, "Shape of a new variable (Weights) must be fully defined", so a variable cannot have None as one of its dimensions. It looks to me like you confused the shape of the input tensor with the shape of the weights tensor. It also looks like you have not flattened your image anywhere, so the model does not make sense as written. Where you have:

    logits = tf.matmul(image, w) + b

it looks like you are trying to treat this problem as a simple logistic regression with the pixels of the image as individual features. That's an OK first approach (though one would usually use a conv-net on images), but you have to actually flatten each image so that the batch has shape=[batchsize,30000]. Your weights would then have shape=[30000,num_labels], so that the matrix multiplication produces an output of shape=[batchsize,num_labels]. Based on how your code is written, I feel you may have some fundamental misunderstandings of the mathematics behind what you are trying to accomplish, so it may be worth reviewing exactly what the model is supposed to compute.
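The shape bookkeeping described above can be checked with a small NumPy sketch. The batch size of 128, the 100x100x3 images, and the 3 classes come from the question; the random data and the variable names are placeholders for illustration, not the asker's actual input pipeline.

```python
import numpy as np

batch_size, height, width, channels = 128, 100, 100, 3
num_labels = 3  # bike, bus, car

rng = np.random.default_rng(0)

# A batch of images before flattening (random placeholder data).
images = rng.random((batch_size, height, width, channels), dtype=np.float32)

# Flatten every image into one row of 100 * 100 * 3 = 30000 pixel features.
flat = images.reshape(batch_size, -1)

# Weights map 30000 features to num_labels scores; the bias broadcasts over the batch.
w = rng.normal(0.0, 0.01, size=(flat.shape[1], num_labels)).astype(np.float32)
b = np.zeros((1, num_labels), dtype=np.float32)

logits = flat @ w + b

print(flat.shape)    # (128, 30000)
print(w.shape)       # (30000, 3)
print(logits.shape)  # (128, 3)
```

Note that the weight shape contains no batch dimension: only the input carries the variable-size axis, which is why a None in the weights is rejected.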

EDIT: The problem here is a misunderstanding of what the algorithm is doing. The algorithm produces 3 outputs per image, so each label must have 3 corresponding components to match them. Your labels cannot be just one number (0, 1 or 2, depending on the class). Each label must be 3 numbers, each telling you whether the image is in that class or not. In other words, you must label your images with a 3-component (one-hot) vector rather than a 1-component number. Your label for each image should look like this:

[1,0,0] - bike
[0,1,0] - bus
[0,0,1] - car

so the shape of your labels, (128,3), is the same as the shape of the output, (128,3).
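As a rough illustration of that conversion, here is a NumPy sketch that turns integer class ids into one-hot rows and evaluates a softmax cross-entropy against them. The four example labels, the all-zero logits, and the helper name softmax_cross_entropy are made up for the example. In TensorFlow itself the equivalent conversion would be tf.one_hot(label, depth=3), for instance inside _parse_function; alternatively, tf.nn.sparse_softmax_cross_entropy_with_logits accepts the integer labels directly.

```python
import numpy as np

num_labels = 3
labels = np.array([0, 2, 1, 0])  # integer class ids: 0 = bike, 1 = bus, 2 = car

# Row i of the identity matrix is the one-hot vector for class i.
one_hot = np.eye(num_labels, dtype=np.float32)[labels]

def softmax_cross_entropy(logits, one_hot_labels):
    """Per-example cross-entropy between softmax(logits) and one-hot labels."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -(one_hot_labels * log_probs).sum(axis=1)

logits = np.zeros((4, num_labels), dtype=np.float32)  # placeholder logits
loss = softmax_cross_entropy(logits, one_hot).mean()

print(one_hot.shape)  # (4, 3), the same shape as the logits
print(loss)           # uniform predictions over 3 classes give ln(3), about 1.0986
```

With one-hot labels of shape (128,3), the tf.argmax(label, axis=1) call in the question's accuracy computation also has a second axis to reduce over.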
