
Why is the EMNIST data reshaped from (28*28) to [-1, 784] instead of [0, 784] in this image classification problem?

This code snippet is from https://www.tensorflow.org/federated/tutorials/federated_learning_for_image_classification

The example is an image classification problem using federated learning. The function below is the pre-processing function for the EMNIST data (each image is 28*28). Can anyone help me understand why the data was reshaped with -1 and 784? As far as I understand, we convert it from a two-dimensional to a one-dimensional array because it is easier to process, but I am not sure why -1 was included. Wouldn't 0 or 784 have been enough?

import collections
import tensorflow as tf

NUM_CLIENTS = 10
NUM_EPOCHS = 5
BATCH_SIZE = 20
SHUFFLE_BUFFER = 100
PREFETCH_BUFFER=10

def preprocess(dataset):

  def batch_format_fn(element):
    """Flatten a batch `pixels` and return the features as an `OrderedDict`."""
    return collections.OrderedDict(
        x=tf.reshape(element['pixels'], [-1, 784]),
        y=tf.reshape(element['label'], [-1, 1]))

  return dataset.repeat(NUM_EPOCHS).shuffle(SHUFFLE_BUFFER).batch(
      BATCH_SIZE).map(batch_format_fn).prefetch(PREFETCH_BUFFER)

The -1 here indicates that the size of this dimension should be inferred, and should be considered to be a batch dimension. Since the EMNIST data is 28 x 28 pixels, if we have N examples of this data, we will have N x 28 x 28 = N x 784 total pixels here. The -1 allows this map function to be agnostic to batch size.
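A minimal sketch of the inference (the batch sizes here are made up for illustration; only the tf.reshape call is taken from the tutorial):

import tensorflow as tf

# A hypothetical batch of 32 images, each 28 x 28 pixels.
batch = tf.zeros([32, 28, 28])

# -1 tells TensorFlow to infer that dimension: 32 * 28 * 28 / 784 = 32.
flat = tf.reshape(batch, [-1, 784])
print(flat.shape)  # (32, 784)

# The same call works unchanged for a different batch size, e.g. a
# final partial batch of 7 examples:
print(tf.reshape(tf.zeros([7, 28, 28]), [-1, 784]).shape)  # (7, 784)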

If we were to apply this map function before batching, we would be able to hardcode the -1 as a 1 instead--but this would be an antipattern for writing tf.data.Dataset pipelines generally, see the vectorized mapping section in the guidance on writing performant tf.data.Dataset pipelines.
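A rough sketch of the two orderings (the dataset here is synthetic, just to show where the map runs):

import tensorflow as tf

# 100 fake 28 x 28 images standing in for the EMNIST pixels.
ds = tf.data.Dataset.from_tensor_slices(tf.zeros([100, 28, 28]))

# Batch first, then map: the reshape runs once per batch of 20 (vectorized).
batched_first = ds.batch(20).map(lambda px: tf.reshape(px, [-1, 784]))

# Map first, then batch: each element is a single image, so the leading
# dimension could be hardcoded to 1 -- but the reshape now runs once per
# element, 20 times more often (and the batches come out with shape
# [20, 1, 784] rather than [20, 784]).
mapped_first = ds.map(lambda px: tf.reshape(px, [1, 784])).batch(20)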

We would not be able to use a 0 here, for that would only work if there were exactly 0 examples in element; as the equation above indicates, it would hardcode an assumption that there are 0 pixels in element.
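A quick way to see why 0 fails (the tensor is illustrative):

import tensorflow as tf

batch = tf.zeros([32, 28, 28])             # 32 * 784 = 25088 pixels in total
print(tf.reshape(batch, [-1, 784]).shape)  # (32, 784) -- the 32 is inferred
# tf.reshape(batch, [0, 784])  # raises InvalidArgumentError: a [0, 784]
#                              # tensor holds 0 elements, but the input
#                              # holds 25088.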

tf.reshape() works in a manner similar to numpy.reshape(), so I'll explain it using numpy.reshape() to keep things simple. Suppose you have a dataset represented by a numpy array X = [[1,2],[3,4],[5,6]]. This is a 2-dimensional array with shape (3, 2). When representing a shape with parentheses, the first element (in this case 3) is the number of rows (i.e. the number of samples), while the second element is the number of columns (the number of features). When we use -1 in a shape, that dimension is inferred from the total number of elements. For example, X.reshape(-1, 1) fixes the number of columns at 1 and lets numpy work out the number of rows (6, since there are 6 elements in total). The resulting X is [[1],[2],[3],[4],[5],[6]]. This is shown below:

import numpy as np

X = np.array([[1, 2], [3, 4], [5, 6]])
print("The initial dimensions are:", X.ndim)
print("The initial shape is:", X.shape)
print("The initial dataset is:", X)

X = X.reshape(-1, 1)
print("The new dimensions are:", X.ndim)
print("The new shape is:", X.shape)
print("The new dataset is:", X)

The initial dimensions are: 2
The initial shape is: (3, 2)
The initial dataset is: [[1 2]
 [3 4]
 [5 6]]
The new dimensions are: 2
The new shape is: (6, 1)
The new dataset is: [[1]
 [2]
 [3]
 [4]
 [5]
 [6]]

Similarly, x=tf.reshape(element['pixels'], [-1, 784]) flattens the 28x28 pixel values into 784 columns, while the number of rows (representing the number of samples) is inferred thanks to the -1. You cannot use 0, as that would mean zero rows (samples).
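The same experiment in TensorFlow, using the array from the numpy example above (a sketch, run in eager mode):

import tensorflow as tf

X = tf.constant([[1, 2], [3, 4], [5, 6]])  # shape (3, 2)
print(tf.reshape(X, [-1, 1]).shape)        # (6, 1): 6 is inferred as 3*2/1
# tf.reshape(X, [0, 1])  # fails: a [0, 1] tensor holds 0 elements, not 6.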

