
Non-squared convolution kernel size

It is very common to use square-sized kernels in convolutional neural networks, i.e. (3,3), (5,5), etc.

What would be the pros and cons of using non-square kernel sizes, meaning (3,7), (3,9), etc.?
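For concreteness, here is a minimal NumPy sketch (the function name and kernel values are illustrative, not from any specific framework) showing how a non-square kernel changes the output shape of a "valid" convolution:

```python
import numpy as np

def conv2d_valid(x, k):
    """Naive 'valid' 2-D cross-correlation (no padding, stride 1)."""
    kh, kw = k.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

x = np.random.rand(32, 32)
print(conv2d_valid(x, np.ones((3, 3))).shape)  # (30, 30) -- square kernel
print(conv2d_valid(x, np.ones((3, 9))).shape)  # (30, 24) -- non-square kernel
```

Note that the non-square kernel shrinks the width of the feature map more than the height, which is one practical consideration when stacking such layers.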

I cannot think of any cons. It really depends on what you want to do and what your data is.

When you use a square-sized kernel, the kernel maps that area to one point in the output of the convolution. So with a square, each point in the output is obtained from an evenly balanced set of weighted neighbours of the input point (the same number of vertical neighbours as horizontal ones).

However, if you use a non-square kernel size, say 3×9, you map each input point using three times as many horizontal neighbours as vertical ones (or vice versa). Depending on the nature of the data, that might simplify your training process and increase accuracy (if you are trying to detect very long, thin crocodiles, for example ^_^). After all, these are just my opinions, not 100% scientific facts.
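The horizontal/vertical asymmetry above can be demonstrated with a small sketch (the 3×9 averaging kernel and test images are made up for illustration): a 3×9 kernel responds three times more strongly to a horizontal bar than to a vertical one, because it covers nine horizontal neighbours but only three vertical ones.

```python
import numpy as np

k = np.ones((3, 9)) / 27  # 3x9 averaging kernel: 3 vertical, 9 horizontal neighbours

def response_at_center(img, k):
    """Cross-correlation response of kernel k at the image centre."""
    kh, kw = k.shape
    r, c = img.shape[0] // 2, img.shape[1] // 2
    patch = img[r - kh // 2: r + kh // 2 + 1, c - kw // 2: c + kw // 2 + 1]
    return float(np.sum(patch * k))

h_line = np.zeros((21, 21)); h_line[10, :] = 1.0  # horizontal bar through the centre
v_line = np.zeros((21, 21)); v_line[:, 10] = 1.0  # vertical bar through the centre

print(response_at_center(h_line, k))  # 9/27 = 0.333... (9 ones under the kernel)
print(response_at_center(v_line, k))  # 3/27 = 0.111... (only 3 ones under the kernel)
```

The same logic applies to learned kernels: an elongated kernel has more capacity along its long axis.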

The reason behind square-sized kernels is that in general you have no idea what orientation the learned features will have, so you don't want to restrict the network. The optimal shape for a filter would be a circle, so it could learn a feature with an arbitrary orientation anywhere inside a region of a given radius. Since this is not really feasible for implementation reasons, a square is the next best shape.

If you knew, e.g., that all learned features would have a 1×3 (height×width) aspect ratio, you could use a kernel size like 2×6. But you just don't know this. Even if you say that the objects you want to detect/classify look like this, that doesn't translate to the features the network will learn in order to identify them. The whole advantage is that you let the network learn the features, and in my opinion you should restrict this as little as possible.

But I don't want to discourage you. Deep learning involves a lot of experimentation and trial and error, so just try it out and see for yourself. Maybe for some kind of problem it actually performs better, who knows.

You can use whatever kernel size you like. The kernel does not need to be square, especially when you want to pay more attention to processing along a specific orientation. In fact, a moving average along a specific axis of an image is a simple filter with a rectangular shape.
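A moving average along one axis is exactly a 1×N rectangular filter; a minimal NumPy sketch (function name and window width chosen for illustration):

```python
import numpy as np

def moving_average_rows(img, width=3):
    """Horizontal moving average = 'valid' convolution with a 1 x width kernel."""
    kernel = np.ones(width) / width
    # Apply the 1-D kernel along each row (axis 1) independently.
    return np.apply_along_axis(
        lambda row: np.convolve(row, kernel, mode="valid"), 1, img
    )

img = np.arange(16.0).reshape(4, 4)
print(moving_average_rows(img, width=3))
# [[ 1.  2.]
#  [ 5.  6.]
#  [ 9. 10.]
#  [13. 14.]]
```

Each row is smoothed horizontally while the vertical direction is left untouched, which is precisely the kind of orientation-specific processing a non-square kernel enables.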
