
Strided convolution of 2D in numpy

I tried to implement strided convolution of a 2D array using a for loop, i.e.

import numpy as np

arr = np.array([[2,3,7,4,6,2,9],
                [6,6,9,8,7,4,3],
                [3,4,8,3,8,9,7],
                [7,8,3,6,6,3,4],
                [4,2,1,8,3,4,6],
                [3,2,4,1,9,8,3],
                [0,1,3,9,2,1,4]])

arr2 = np.array([[3,4,4],
                 [1,0,2],
                 [-1,0,3]])

def stride_conv(arr1,arr2,s,p):
    beg = 0
    end = arr2.shape[0]
    final = []
    for i in range(0,arr1.shape[0]-1,s):
        k = []
        for j in range(0,arr1.shape[0]-1,s):
            k.append(np.sum(arr1[beg+i : end+i, beg+j:end+j] * (arr2)))
        final.append(k)

    return np.array(final)

stride_conv(arr,arr2,2,0)

This results in a 3*3 array:

array([[ 91, 100,  88],
       [ 69,  91, 117],
       [ 44,  72,  74]])

Is there a numpy function or scipy function to do the same? My approach is not that good. How can I vectorize this?

Ignoring the padding argument and trailing windows that won't have enough length for convolution against the second array, here's one way with np.lib.stride_tricks.as_strided -

def strided4D(arr,arr2,s):
    strided = np.lib.stride_tricks.as_strided
    s0,s1 = arr.strides
    m1,n1 = arr.shape
    m2,n2 = arr2.shape    
    out_shp = (1+(m1-m2)//s, m2, 1+(n1-n2)//s, n2)
    return strided(arr, shape=out_shp, strides=(s*s0,s*s1,s0,s1))

def stride_conv_strided(arr,arr2,s):
    arr4D = strided4D(arr,arr2,s=s)
    return np.tensordot(arr4D, arr2, axes=((2,3),(0,1)))
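
As a quick sanity check (assuming arr and arr2 from the question are still in scope), the strided-view version reproduces the stride-2 loop result:

>>> stride_conv_strided(arr, arr2, s=2)
array([[ 91, 100,  88],
       [ 69,  91, 117],
       [ 44,  72,  74]])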

Alternatively, we can use the scikit-image built-in view_as_windows to get those windows elegantly, like so -

from skimage.util.shape import view_as_windows

def strided4D_v2(arr,arr2,s):
    return view_as_windows(arr, arr2.shape, step=s)
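
To get the convolution output from those windows, the same tensordot contraction as before can be applied. A minimal sketch (stride_conv_v2 is just an illustrative name; it assumes np, arr and arr2 are defined as in the question):

def stride_conv_v2(arr, arr2, s):
    # window axes: (win_row, win_col, kernel_row, kernel_col)
    windows = view_as_windows(arr, arr2.shape, step=s)
    # contract the kernel axes against arr2 -> strided cross-correlation
    return np.tensordot(windows, arr2, axes=((2, 3), (0, 1)))

>>> stride_conv_v2(arr, arr2, 2)
array([[ 91, 100,  88],
       [ 69,  91, 117],
       [ 44,  72,  74]])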

I think we can do a "valid" fft convolution and pick out only those results at strided locations, like this:

import numpy as np
import scipy.signal

def strideConv(arr,arr2,s):
    cc=scipy.signal.fftconvolve(arr,arr2[::-1,::-1],mode='valid')
    idx=(np.arange(0,cc.shape[1],s), np.arange(0,cc.shape[0],s))
    xidx,yidx=np.meshgrid(*idx)
    return cc[yidx,xidx]

This gives the same results as other people's answers. But I guess this only works if the kernel size is odd-numbered.

Also, I've flipped the kernel in arr2[::-1,::-1] just to stay consistent with the others; you may want to omit it depending on context.
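
A quick check against the question's arrays (fftconvolve returns floats, so the result is rounded for display):

>>> strideConv(arr, arr2, 2).round().astype(int)
array([[ 91, 100,  88],
       [ 69,  91, 117],
       [ 44,  72,  74]])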

UPDATE:

We currently have a few different ways of doing 2D or 3D convolution using numpy and scipy alone, and I thought about doing some comparisons to give some idea of which one is faster on data of different sizes. I hope this won't be regarded as off-topic.

Method 1: FFT convolution (using scipy.signal.fftconvolve):

import numpy
from scipy.signal import fftconvolve

def padArray(var,pad,method=1):
    if method==1:
        var_pad=numpy.zeros(tuple(2*pad+numpy.array(var.shape[:2]))+var.shape[2:])
        var_pad[pad:-pad,pad:-pad]=var
    else:
        var_pad=numpy.pad(var,([pad,pad],[pad,pad])+([0,0],)*(numpy.ndim(var)-2),
                mode='constant',constant_values=0)
    return var_pad

def conv3D(var,kernel,stride=1,pad=0,pad_method=1):
    '''3D convolution using scipy.signal.fftconvolve.
    '''
    var_ndim=numpy.ndim(var)
    kernel_ndim=numpy.ndim(kernel)
    stride=int(stride)

    if var_ndim<2 or var_ndim>3 or kernel_ndim<2 or kernel_ndim>3:
        raise Exception("<var> and <kernel> dimension should be in 2 or 3.")

    if var_ndim==2 and kernel_ndim==3:
        raise Exception("<kernel> dimension > <var>.")

    if var_ndim==3 and kernel_ndim==2:
        kernel=numpy.repeat(kernel[:,:,None],var.shape[2],axis=2)

    if pad>0:
        var_pad=padArray(var,pad,pad_method)
    else:
        var_pad=var

    conv=fftconvolve(var_pad,kernel,mode='valid')

    if stride>1:
        conv=conv[::stride,::stride,...]

    return conv
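
For a 2D sanity check against the question's example: fftconvolve performs a true convolution, so the kernel is flipped here to match the cross-correlation computed by the question's loop, and the float output is rounded:

>>> conv3D(arr, arr2[::-1, ::-1], stride=2, pad=0).round().astype(int)
array([[ 91, 100,  88],
       [ 69,  91, 117],
       [ 44,  72,  74]])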

Method 2: Special conv (see this answer):

def conv3D2(var,kernel,stride=1,pad=0):
    '''3D convolution by sub-matrix summing.
    '''
    var_ndim=numpy.ndim(var)
    ny,nx=var.shape[:2]
    ky,kx=kernel.shape[:2]

    result=0

    if pad>0:
        var_pad=padArray(var,pad,1)
    else:
        var_pad=var

    for ii in range(ky*kx):
        yi,xi=divmod(ii,kx)
        slabii=var_pad[yi:2*pad+ny-ky+yi+1:1, xi:2*pad+nx-kx+xi+1:1,...]*kernel[yi,xi]
        if var_ndim==3:
            slabii=slabii.sum(axis=-1)
        result+=slabii

    if stride>1:
        result=result[::stride,::stride,...]

    return result
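
Note that this sub-matrix summing variant applies the kernel without flipping it, so on the question's 2D example it should reproduce the strided cross-correlation directly:

>>> conv3D2(arr, arr2, stride=2)
array([[ 91, 100,  88],
       [ 69,  91, 117],
       [ 44,  72,  74]])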

Method 3: Strided-view conv, as suggested by Divakar:

def asStride(arr,sub_shape,stride):
    '''Get a strided sub-matrices view of an ndarray.

    <arr>: ndarray of rank 2.
    <sub_shape>: tuple of length 2, window size: (ny, nx).
    <stride>: int, stride of windows.

    Return <subs>: strided window view.

    See also skimage.util.shape.view_as_windows()
    '''
    s0,s1=arr.strides[:2]
    m1,n1=arr.shape[:2]
    m2,n2=sub_shape[:2]

    view_shape=(1+(m1-m2)//stride,1+(n1-n2)//stride,m2,n2)+arr.shape[2:]
    strides=(stride*s0,stride*s1,s0,s1)+arr.strides[2:]
    subs=numpy.lib.stride_tricks.as_strided(arr,view_shape,strides=strides)

    return subs

def conv3D3(var,kernel,stride=1,pad=0):
    '''3D convolution by strided view.
    '''
    var_ndim=numpy.ndim(var)
    kernel_ndim=numpy.ndim(kernel)

    if var_ndim<2 or var_ndim>3 or kernel_ndim<2 or kernel_ndim>3:
        raise Exception("<var> and <kernel> dimension should be in 2 or 3.")

    if var_ndim==2 and kernel_ndim==3:
        raise Exception("<kernel> dimension > <var>.")

    if var_ndim==3 and kernel_ndim==2:
        kernel=numpy.repeat(kernel[:,:,None],var.shape[2],axis=2)

    if pad>0:
        var_pad=padArray(var,pad,1)
    else:
        var_pad=var

    view=asStride(var_pad,kernel.shape,stride)
    #return numpy.tensordot(view,kernel,axes=((2,3),(0,1)))
    if numpy.ndim(kernel)==2:
        conv=numpy.sum(view*kernel,axis=(2,3))
    else:
        conv=numpy.sum(view*kernel,axis=(2,3,4))

    return conv
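
Like the sub-matrix version, this strided-view routine multiplies the windows by the kernel without flipping it, so the question's 2D example comes out directly:

>>> conv3D3(arr, arr2, stride=2)
array([[ 91, 100,  88],
       [ 69,  91, 117],
       [ 44,  72,  74]])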

I did 3 sets of comparisons:

  1. Convolution on 2D data, with different input sizes and different kernel sizes, stride=1, pad=0. Results below (color indicates the time used for the convolution, repeated 10 times):

[figure: 2D convolution timing comparison]

So "FFT conv" is in general the fastest.所以“FFT conv”通常是最快的。 "Special conv" and "Stride-view conv" get slow as kernel size increases, but decreases again as it approaches the size of input data. “Special conv”和“Stride-view conv”随着内核大小的增加而变慢,但随着接近输入数据的大小而再次减小。 The last subplot shows the fastest method, so the big triangle of purple indicates FFT being the winner, but note there is a thin green column on the left side (probably too small to see, but it's there), suggesting that "Special conv" has advantage for very small kernels (smaller than about 5x5).最后一个子图显示了最快的方法,所以紫色的大三角形表示 FFT 是赢家,但请注意左侧有一个细的绿色列(可能太小看不到,但它在那里),表明“特殊转换”对于非常小的内核(小于大约 5x5)具有优势。 And when kernel size approaches input, "stride-view conv" is fastest (see the diagonal line).当内核大小接近输入时,“stride-view conv”是最快的(见对角线)。

Comparison 2: convolution on 3D data.

Setup: pad=0, stride=2, input dimension = nxnx5, kernel shape = fxfx5.

I skipped the computations of "Special conv" and "Stride-view conv" when the kernel size is in the middle of the input size range. Basically "Special conv" shows no advantage now, and "Stride-view" is faster than FFT for both small and large kernels.

[figure: 3D convolution timing comparison, stride=2]

One additional note: when the size goes above 350, I noticed considerable memory-usage peaks for the "Stride-view conv".

Comparison 3: convolution on 3D data with larger stride.

Setup: pad=0, stride=5, input dimension = nxnx10, kernel shape = fxfx10.

This time I omitted the "Special conv". Over a larger area of the parameter space "Stride-view conv" surpasses FFT, and the last subplot shows that the difference approaches 100%. Probably because as the stride goes up, the FFT approach computes more wasted numbers, so "Stride-view" gains more of an advantage for both small and large kernels.

[figure: 3D convolution timing comparison, stride=5]

How about using signal.convolve2d from scipy?

My approach is similar to Jason's, but using indexing.

from scipy import signal

def strideConv(arr, arr2, s):
    return signal.convolve2d(arr, arr2[::-1, ::-1], mode='valid')[::s, ::s]

Note that the kernel has to be reversed. For details, please see the discussions here and here. Otherwise use signal.correlate2d.

Examples:

 >>> strideConv(arr, arr2, 1)
 array([[ 91,  80, 100,  84,  88],
        [ 99, 106, 126,  92,  77],
        [ 69,  98,  91,  93, 117],
        [ 80,  79,  87,  93,  61],
        [ 44,  72,  72,  63,  74]])
 >>> strideConv(arr, arr2, 2)
 array([[ 91, 100,  88],
        [ 69,  91, 117],
        [ 44,  72,  74]])
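
Equivalently, the flip can be avoided altogether by using signal.correlate2d, which matches the question's loop directly:

 >>> signal.correlate2d(arr, arr2, mode='valid')[::2, ::2]
 array([[ 91, 100,  88],
        [ 69,  91, 117],
        [ 44,  72,  74]])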

Here is an O(N^d (log N)^d) fft-based approach. The idea is to chop up both operands into stride-spaced grids at all offsets modulo the strides, do the conventional fft convolution between grids of corresponding offsets and then pointwise sum the results. It is a bit index-heavy, but I'm afraid that can't be helped:

import numpy as np
from numpy.fft import fftn, ifftn

def strided_conv_2d(x, y, strides):
    s, t = strides
    # consensus dtype
    cdt = (x[0, 0, ...] + y[0, 0, ...]).dtype
    xi, xj = x.shape
    yi, yj = y.shape
    # round up modulo strides
    xk, xl, yk, yl = map(lambda a, b: -a//b * -b, (xi,xj,yi,yj), (s,t,s,t))
    # zero pad to avoid circular convolution
    xp, yp = (np.zeros((xk+yk, xl+yl), dtype=cdt) for i in range(2))
    xp[:xi, :xj] = x
    yp[:yi, :yj] = y
    # fold out strides
    xp = xp.reshape((xk+yk)//s, s, (xl+yl)//t, t)
    yp = yp.reshape((xk+yk)//s, s, (xl+yl)//t, t)
    # do conventional fft convolution
    xf = fftn(xp, axes=(0, 2))
    yf = fftn(yp, axes=(0, 2))
    result = ifftn(xf * yf.conj(), axes=(0, 2)).sum(axis=(1, 3))
    # restore dtype
    if cdt in (int, np.int_, np.int64, np.int32):
        result = result.real.round()
    return result.astype(cdt)

arr = np.array([[2,3,7,4,6,2,9],
                [6,6,9,8,7,4,3],
                [3,4,8,3,8,9,7],
                [7,8,3,6,6,3,4],
                [4,2,1,8,3,4,6],
                [3,2,4,1,9,8,3],
                [0,1,3,9,2,1,4]])

arr2 = np.array([[3,4,4],
                 [1,0,2],
                 [-1,0,3]])

print(strided_conv_2d(arr, arr2, (2, 2)))

Result:

[[ 91 100  88  23   0  29]
 [ 69  91 117  19   0  38]
 [ 44  72  74  17   0  22]
 [ 16  53  26  12   0   0]
 [  0   0   0   0   0   0]
 [ 19  11  21  -9   0   6]]
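
For this example, the question's stride-2 result sits in the top-left 3x3 block of that output; the remaining entries come from windows that extend past arr into the zero padding or wrap around. A slicing sketch:

>>> strided_conv_2d(arr, arr2, (2, 2))[:3, :3]
array([[ 91, 100,  88],
       [ 69,  91, 117],
       [ 44,  72,  74]])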

As far as I know, there is no direct implementation of a convolution filter in numpy or scipy that supports stride and padding, so I think it's better to use a DL package such as torch or tensorflow and then cast the final result back to numpy. A torch implementation might be:

import torch
import torch.nn.functional as F

# conv2d expects floating-point tensors, so cast the integer arrays to float and back
arr = torch.tensor(np.expand_dims(arr, axis=(0, 1)), dtype=torch.float32)    # shape (1, 1, 7, 7)
arr2 = torch.tensor(np.expand_dims(arr2, axis=(0, 1)), dtype=torch.float32)  # shape (1, 1, 3, 3)
output = F.conv2d(arr, arr2, stride=2, padding=0)
output = output.numpy().squeeze().astype(int)

>>> output
array([[ 91, 100,  88],
       [ 69,  91, 117],
       [ 44,  72,  74]])

Convolution which supports strides and dilation. numpy.lib.stride_tricks.as_strided is used.

import numpy as np
from numpy.lib.stride_tricks import as_strided

def conv_view(X, F_s, dr, std):
    X_s = np.array(X.shape)
    F_s = np.array(F_s)
    dr = np.array(dr)
    Fd_s = (F_s - 1) * dr + 1
    if np.any(Fd_s > X_s):
        raise ValueError('(Dilated) filter size must be smaller than X')
    std = np.array(std)
    X_ss = np.array(X.strides)
    Xn_s = (X_s - Fd_s) // std + 1
    Xv_s = np.append(Xn_s, F_s)
    Xv_ss = np.tile(X_ss, 2) * np.append(std, dr)
    return as_strided(X, Xv_s, Xv_ss, writeable=False)

def convolve_stride(X, F, dr=None, std=None):
    if dr is None:
        dr = np.ones(X.ndim, dtype=int)
    if std is None:
        std = np.ones(X.ndim, dtype=int)
    if not (X.ndim == F.ndim == len(dr) == len(std)):
        raise ValueError('X.ndim, F.ndim, len(dr), len(std) must be the same')
    Xv = conv_view(X, F.shape, dr, std)
    return np.tensordot(Xv, F, axes=X.ndim)
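
A usage sketch on the question's arrays. Note that the contraction applies the filter without flipping it, so the result matches the question's strided cross-correlation; pass F[::-1, ::-1] if a flipped (true convolution) kernel is wanted:

>>> convolve_stride(arr, arr2, std=(2, 2))
array([[ 91, 100,  88],
       [ 69,  91, 117],
       [ 44,  72,  74]])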
