I am trying out a recent arXiv work called "Factorized CNN",
which mainly argues that spatially separable convolution (depthwise convolution), together with channel-wise linear projection (1x1 convolution), can speed up the convolution operation.
This is the figure for their conv layer architecture:
I found out that I can implement this architecture with tf.nn.depthwise_conv2d and 1x1 convolution, or with tf.nn.separable_conv2d.
Below is my implementation:
#conv filter for depthwise convolution
depthwise_filter = tf.get_variable("depth_conv_w", [3,3,64,1], initializer=tf.random_normal_initializer(stddev=np.sqrt(2.0/9/32)))
#conv filter for linear channel projection
pointwise_filter = tf.get_variable("point_conv_w", [1,1,64,64], initializer=tf.random_normal_initializer(stddev=np.sqrt(2.0/1/64)))
conv_b = tf.get_variable("conv_b", [64], initializer=tf.constant_initializer(0))
#depthwise convolution, with multiplier 1
conv_tensor = tf.nn.relu(tf.nn.depthwise_conv2d(tensor, depthwise_filter, [1,1,1,1], padding='SAME'))
#linear channel projection with 1x1 convolution
conv_tensor = tf.nn.bias_add(tf.nn.conv2d(conv_tensor, pointwise_filter, [1,1,1,1], padding='VALID'), conv_b)
#residual
tensor = tf.add(tensor, conv_tensor)
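The same depthwise-then-pointwise pair can also be expressed as the single tf.nn.separable_conv2d op mentioned above. A minimal sketch (the input tensor and random filters here are placeholders chosen to match the 64-channel shapes in the question, not the question's actual variables):

```python
import numpy as np
import tensorflow as tf

# Dummy NHWC input and filters matching the question's shapes.
x = tf.constant(np.random.rand(1, 32, 32, 64).astype(np.float32))
depthwise_filter = tf.constant(np.random.rand(3, 3, 64, 1).astype(np.float32))
pointwise_filter = tf.constant(np.random.rand(1, 1, 64, 64).astype(np.float32))

# Depthwise convolution followed by the 1x1 channel projection, fused into one op.
y = tf.nn.separable_conv2d(x, depthwise_filter, pointwise_filter,
                           strides=[1, 1, 1, 1], padding='SAME')
```

Note that separable_conv2d applies the 1x1 projection immediately after the depthwise pass, so the ReLU between the two stages in my code above cannot be expressed with it.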
This should be around 8 times faster than the original 3x3, 64 -> 64 channel convolution.
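A quick mult-add count per output position (a back-of-envelope check that ignores the bias and memory traffic) shows where the theoretical speedup comes from:

```python
# Mult-adds per output position for a 3x3, 64 -> 64 layer.
k, c_in, c_out = 3, 64, 64
standard = k * k * c_in * c_out           # standard 3x3 conv: 36864
factorized = k * k * c_in + c_in * c_out  # depthwise + 1x1: 576 + 4096 = 4672
speedup = standard / factorized
print(round(speedup, 1))  # 7.9
```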
However, I cannot observe any performance improvement.
I have to assume that either I am doing something wrong, or there is something wrong with TensorFlow's implementation.
Since there are few examples using depthwise_conv2d, I am leaving this question here.
Is this slow speed normal, or is there a mistake somewhere?
Currently, the implementation of depthwise conv2d does not fully exploit the GPU's parallelism, so you will need to wait for a faster implementation in the future. For example, for Caffe there is a faster third-party implementation of this kernel: https://github.com/yonghenglh6/DepthwiseConvolution
Depthwise convolutions provide significant performance benefits owing to the reduction in both parameters and mult-adds. However, training depthwise convolution layers with GPUs is slow in current deep learning frameworks because their implementations cannot fully utilize the GPU capacity.
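One way to see this for yourself is to time the two ops in isolation. A rough microbenchmark sketch, assuming TF 2.x eager execution (the input shape and iteration count are arbitrary choices, and wall-clock timing like this is not a rigorous profile):

```python
import time
import numpy as np
import tensorflow as tf

# Dummy NHWC input and filters for a single 64-channel layer.
x = tf.constant(np.random.rand(1, 64, 64, 64).astype(np.float32))
w_full = tf.constant(np.random.rand(3, 3, 64, 64).astype(np.float32))  # standard 3x3
w_dw = tf.constant(np.random.rand(3, 3, 64, 1).astype(np.float32))    # depthwise

def bench(fn, n=50):
    fn()  # warm-up run (triggers any lazy initialization)
    t0 = time.perf_counter()
    for _ in range(n):
        fn()
    return (time.perf_counter() - t0) / n

t_full = bench(lambda: tf.nn.conv2d(x, w_full, strides=[1, 1, 1, 1], padding='SAME'))
t_dw = bench(lambda: tf.nn.depthwise_conv2d(x, w_dw, strides=[1, 1, 1, 1], padding='SAME'))
print(t_full, t_dw)  # the depthwise op is often nowhere near ~8x faster in practice
```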