Custom conv2d operation Pytorch

I have tried to write a custom Conv2d function that works like nn.Conv2d, except that the multiplication and addition used inside nn.Conv2d are replaced with mymult(num1, num2) and myadd(num1, num2).

Based on insights from two very helpful forum threads (1, 2), what I can do is unfold the input and then do a matrix multiplication. The @ part in the code below can be done using loops with mymult() and myadd(), since I believe @ is performing a matmul.

import torch
import torch.nn as nn
import torch.nn.functional as F

def convcheck():
    torch.manual_seed(123)
    batch_size = 2
    channels = 2

    h, w = 2, 2
    image = torch.randn(batch_size, channels, h, w)  # input image
    out_channels = 3
    kh, kw = 1, 1  # kernel size
    dh, dw = 1, 1  # stride
    # Output size along one side (input and kernel are square here);
    # put the padding in place of the zero if padding is used.
    size = int((h - kh + 2 * 0) / dh + 1)

    conv = nn.Conv2d(in_channels=channels, out_channels=out_channels,
                     kernel_size=(kh, kw), padding=0, stride=(dh, dw), bias=False)

    out = conv(image)  # reference result from the built-in layer
    filt = conv.weight.data

    # Each column of the unfolded image is one flattened input patch.
    imageunfold = F.unfold(image, kernel_size=(kh, kw), padding=0, stride=(dh, dw))

    print("Unfolded image", "\n", imageunfold, "\n", imageunfold.shape)
    kernels_flat = filt.view(out_channels, -1)
    print("Kernel Flat=", "\n", kernels_flat, "\n", kernels_flat.shape)
    res = kernels_flat @ imageunfold  # I have to replace this operation with mymult() and myadd()
    print(res, "\n", res.shape)
    res = res.view(-1, out_channels, size, size)
    # print("Same answer as builtin function", res)
    #print("Same answer as buitlin function",res)

res = kernels_flat @ imageunfold can be replaced with the loops below, although there may be a more efficient implementation, which is what I am looking for help with.

for m_batch in range(len(imageunfold)):
    # iterate through rows of X
    # iterate through columns of Y
    for j in range(imageunfold.size(2)):
        # iterate through rows of Y
        for k in range(imageunfold.size(1)):
            # print(result[m_batch][i][j], " +=", kernels_flat[i][k], "*", imageunfold[m_batch][k][j])
            result[m_batch][i][j] += kernels_flat[i][k] * imageunfold[m_batch][k][j]

Can someone please help me vectorize these three loops for faster execution?
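One possible way to vectorize the loops is sketched below. It assumes mymult and myadd accept tensors and apply elementwise; the placeholder bodies are just ordinary * and +, not the real custom operations, and custom_matmul is a name made up for this illustration. The kernel matrix is broadcast against the unfolded image so that mymult runs once over every (batch, filter, k, position) combination, and myadd is then folded along the k axis, so only the loop over k remains.

```python
import torch

def mymult(a, b):
    # Placeholder (assumption): stands in for the custom multiplication.
    return a * b

def myadd(a, b):
    # Placeholder (assumption): stands in for the custom addition.
    return a + b

def custom_matmul(kernels_flat, imageunfold):
    # kernels_flat: (O, K), imageunfold: (B, K, L)
    # Broadcast to (B, O, K, L): one product per (batch, filter, k, position).
    products = mymult(kernels_flat[None, :, :, None], imageunfold[:, None, :, :])
    # Fold myadd along the k axis; the remaining loop has only K iterations.
    result = products[:, :, 0, :]
    for k in range(1, products.size(2)):
        result = myadd(result, products[:, :, k, :])
    return result  # (B, O, L)

torch.manual_seed(123)
kernels_flat = torch.randn(3, 2)
imageunfold = torch.randn(2, 2, 4)
res = custom_matmul(kernels_flat, imageunfold)
# With the placeholder ops this should agree with the built-in matmul.
print(torch.allclose(res, kernels_flat @ imageunfold, atol=1e-6))
```

If the real mymult and myadd only work on scalars, this does not apply directly and they would first need a tensor-aware version.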

The problem was with the dimensions: with kernels_flat[dim0_1, dim1_1] and imageunfold[batch, dim0_2, dim1_2], the result should have shape [batch, dim0_1, dim1_2].
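The shape rule can be checked directly. A small sketch with arbitrary sizes (chosen here only for illustration), showing that @ broadcasts the 2-D kernel matrix over the batch dimension:

```python
import torch

out_channels, k, batch, positions = 3, 2, 2, 4   # illustrative sizes
kernels_flat = torch.randn(out_channels, k)      # [dim0_1, dim1_1]
imageunfold = torch.randn(batch, k, positions)   # [batch, dim0_2, dim1_2]

res = kernels_flat @ imageunfold                 # kernel matrix is broadcast over batch
print(res.shape)                                 # [batch, dim0_1, dim1_2]

# A loop version therefore needs a result buffer of the same shape.
result = torch.zeros(batch, out_channels, positions)
```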

res = kernels_flat @ imageunfold can be replaced with the loops below, although there may be a more efficient implementation.

for m_batch in range(len(imageunfold)):
    # iterate through rows of X
    # iterate through columns of Y
    for j in range(imageunfold.size(2)):
        # iterate through rows of Y
        for k in range(imageunfold.size(1)):
            # print(result[m_batch][i][j], " +=", kernels_flat[i][k], "*", imageunfold[m_batch][k][j])
            result[m_batch][i][j] += kernels_flat[i][k] * imageunfold[m_batch][k][j]

Your code for the matrix multiplication is missing a loop for iterating over the filters. In the code below I fixed your implementation.

I am currently also looking for optimizations of the code. In my use case, the individual results of the multiplications (without performing the addition) need to be accessible after the computation. I will post here if I find a faster solution than this.

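# res_c is assumed to start as zeros with shape [batch, filters, positions].
res_c = torch.zeros(imageunfold.shape[0], kernels_flat.shape[0], imageunfold.shape[2])
for batch_image in range(imageunfold.shape[0]):      # images in the batch
    for i in range(kernels_flat.shape[0]):           # filters
        for j in range(imageunfold.shape[2]):        # output positions
            for k in range(kernels_flat.shape[1]):   # flattened kernel elements
                res_c[batch_image][i][j] += kernels_flat[i][k] * imageunfold[batch_image][k][j]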
