I am using the MNIST data, whose images are 28x28 pixels. I use padding to convert them to 32x32 pixels, as shown below:
tf.pad(tensor=X_train, paddings=[[0, 0], [2,2], [2,2]])
The output shape comes out as expected:
TensorShape([60000, 32, 32])
I want to understand what exactly [0, 0], [2, 2] and [2, 2] mean. Which of them are the top, bottom, left, and right padding here? What do the numbers represent?
From https://www.tensorflow.org/api_docs/python/tf/pad :
This operation pads a tensor according to the paddings you specify. paddings is an integer tensor with shape [n, 2], where n is the rank of tensor. For each dimension D of input, paddings[D, 0] indicates how many values to add before the contents of tensor in that dimension, and paddings[D, 1] indicates how many values to add after the contents of tensor in that dimension.
Here you have a rank 3 tensor. Dimension 0 is the batch dimension, along which the 28 x 28 images are stacked. Dimensions 1 and 2 correspond to the height and width of each image. In those dimensions, you add 2 elements before and 2 elements after the original rows/columns, which makes the output size 28 + 2 + 2 = 32.
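To make the shape arithmetic concrete, here is a minimal pure-Python sketch of the rule quoted from the docs. The helper name padded_shape is hypothetical (it is not part of TensorFlow); it only computes the shape that the documented rule implies, without padding any data.

```python
def padded_shape(shape, paddings):
    """Hypothetical helper: output shape implied by the tf.pad rule.

    Each dimension D grows by paddings[D][0] (before) + paddings[D][1] (after).
    """
    if len(shape) != len(paddings):
        raise ValueError("paddings needs one [before, after] pair per dimension")
    return [before + size + after
            for size, (before, after) in zip(shape, paddings)]


# The MNIST case from the question: batch untouched, height and width +2 on each side.
print(padded_shape([60000, 28, 28], [[0, 0], [2, 2], [2, 2]]))  # [60000, 32, 32]
```

The position of each pair in paddings, not its value, decides which dimension it applies to: the first pair is the batch, the second the height, the third the width.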
For example, the top and bottom padding is specified by paddings[1], which pads the 28 x 28 tensor with 2 zeros at the top and 2 zeros at the bottom. Similarly, paddings[2] provides the left and right padding amounts.
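As a rough illustration of how the pairs map to top/bottom/left/right for a single image, here is a small pure-Python sketch on a nested list. The function pad_image and its parameter names are hypothetical, and it only mimics zero padding for one 2-D image; it is not how tf.pad is implemented.

```python
def pad_image(image, top, bottom, left, right, fill=0):
    """Hypothetical helper: zero-pad a 2-D nested list.

    (top, bottom) plays the role of paddings[1] and (left, right) the role
    of paddings[2] in the rank 3 [batch, height, width] case.
    """
    width = len(image[0]) + left + right
    padded = [[fill] * width for _ in range(top)]            # rows added before
    for row in image:
        padded.append([fill] * left + list(row) + [fill] * right)
    padded.extend([fill] * width for _ in range(bottom))     # rows added after
    return padded


image = [[1, 2],
         [3, 4]]
for row in pad_image(image, top=1, bottom=1, left=2, right=2):
    print(row)
# [0, 0, 0, 0, 0, 0]
# [0, 0, 1, 2, 0, 0]
# [0, 0, 3, 4, 0, 0]
# [0, 0, 0, 0, 0, 0]
```

Here the 2 x 2 image becomes 4 x 6: one row above and below, two columns on each side.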
Look at this example for a clearer understanding:
>>> import tensorflow as tf
>>> # create a random tensor of shape 2 x 2 x 2
>>> X = tf.random.uniform(shape=[2, 2, 2])
>>> X
<tf.Tensor: shape=(2, 2, 2), dtype=float32, numpy=
array([[[0.60002756, 0.5554304 ],
[0.15563118, 0.75253165]],
[[0.983318 , 0.4908601 ],
[0.16791439, 0.55565095]]], dtype=float32)>
# pad along batch dimension
>>> tf.pad(tensor = X, paddings = [[1, 1], [0, 0], [0, 0]])
<tf.Tensor: shape=(4, 2, 2), dtype=float32, numpy=
array([[[0. , 0. ],
[0. , 0. ]],
[[0.60002756, 0.5554304 ],
[0.15563118, 0.75253165]],
[[0.983318 , 0.4908601 ],
[0.16791439, 0.55565095]],
[[0. , 0. ],
[0. , 0. ]]], dtype=float32)>
# pad along height/rows
>>> tf.pad(tensor = X, paddings = [[0, 0], [1, 1], [0, 0]])
<tf.Tensor: shape=(2, 4, 2), dtype=float32, numpy=
array([[[0. , 0. ],
[0.60002756, 0.5554304 ],
[0.15563118, 0.75253165],
[0. , 0. ]],
[[0. , 0. ],
[0.983318 , 0.4908601 ],
[0.16791439, 0.55565095],
[0. , 0. ]]], dtype=float32)>
# pad along width/columns
>>> tf.pad(tensor = X, paddings = [[0, 0], [0, 0], [1, 1]])
<tf.Tensor: shape=(2, 2, 4), dtype=float32, numpy=
array([[[0. , 0.60002756, 0.5554304 , 0. ],
[0. , 0.15563118, 0.75253165, 0. ]],
[[0. , 0.983318 , 0.4908601 , 0. ],
[0. , 0.16791439, 0.55565095, 0. ]]], dtype=float32)>
Note how the tensor shapes change above after each kind of padding operation.
Since in your case you do not want extra all-zero samples added to the batch, the padding for the batch dimension is [0, 0].