I would like to use tf.nn.conv2d() on a single image example, but the TensorFlow documentation seems to only cover applying this transformation to a batch of images. The docs state that the input must have shape [batch, in_height, in_width, in_channels] and the kernel must have shape [filter_height, filter_width, in_channels, out_channels]. What is the most straightforward way to perform a 2D convolution on a single input of shape [in_height, in_width, in_channels]?
Here is my current approach, where img has shape (height, width, channels):

img = tf.random_uniform((10, 10, 3))  # a single image
kernel = tf.random_uniform((3, 3, 3, 8))  # [filter_height, filter_width, in_channels, out_channels]
img = tf.nn.conv2d([img], kernel, strides=[1, 1, 1, 1], padding='SAME')[0]  # create a batch of 1, then index out the single example
In other words, I am reshaping as follows:

[in_height, in_width, in_channels] -> [1, in_height, in_width, in_channels] -> convolve -> [1, out_height, out_width, out_channels] -> [out_height, out_width, out_channels]
This feels like an unnecessary and costly operation when I am only interested in transforming one example.
Is there a simple/standard way to do this that doesn't involve reshaping?
AFAIK there is no way around it. It seems (here and here) that the first operation creates a copy (someone correct me if I'm wrong). You may use tf.expand_dims instead, though; in my opinion it is more readable because of its verbosity.
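For example, here is a minimal sketch of the tf.expand_dims approach (assuming the TF 2.x eager API; the kernel shape is illustrative, not from the question):

```python
import tensorflow as tf

img = tf.random.uniform((10, 10, 3))      # a single image, [H, W, C]
kernel = tf.random.uniform((3, 3, 3, 8))  # [kH, kW, in_channels, out_channels]

batched = tf.expand_dims(img, axis=0)     # [1, H, W, C]
out = tf.nn.conv2d(batched, kernel, strides=1, padding="SAME")
result = tf.squeeze(out, axis=0)          # back to [H, W, out_channels]
```

Here tf.squeeze(out, axis=0) is just the more explicit counterpart of indexing with [0].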
On the other hand, taking the 0th element from the tensor should not perform a copy in this case and is almost free. Most importantly, aside from a little syntactic inconvenience (e.g. the [0]), those operations are definitely not costly, especially in the context of performing a convolution.
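To illustrate why the indexing side is cheap: the analogous NumPy operations are views that share memory with the original array (TensorFlow's internals may differ, so treat this only as an analogy):

```python
import numpy as np

img = np.random.rand(10, 10, 3)
batched = np.expand_dims(img, axis=0)     # adds a leading axis without copying
assert np.shares_memory(img, batched)     # a view, not a copy
assert np.shares_memory(img, batched[0])  # indexing the 0th element is also a view
```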
BTW, other ready-made layers, such as those in tf.keras, require the batch as the first dimension as well.
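For instance, a Keras convolution layer (the layer parameters here are arbitrary, chosen only for illustration) also expects a leading batch dimension:

```python
import tensorflow as tf

layer = tf.keras.layers.Conv2D(filters=8, kernel_size=3, padding="same")
img = tf.random.uniform((10, 10, 3))  # single image, no batch dimension
out = layer(tf.expand_dims(img, 0))   # Keras layers expect [batch, H, W, C]
single = out[0]                       # shape (10, 10, 8)
```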