NumPy way to generate linear operation matrix from a convolution kernel

A 2D convolution kernel K of shape (k1, k2, n_channel, n_filter), applied to an array A of shape (m1, m2, n_channel), produces an array B of shape (m1 - k1 + 1, m2 - k2 + 1, n_filter) (with 'valid' padding).

It is also true that for each K there exists a W_K of shape (m1 - k1 + 1, m2 - k2 + 1, n_filter, m1, m2, n_channel) such that the tensor dot product of W_K and A equals B, i.e. B = np.tensordot(W_K, A, 3).
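For concreteness, here is a minimal sketch of the convolution itself, computed directly with a sliding window (this assumes NumPy >= 1.20 for sliding_window_view and the machine-learning convention of cross-correlation, i.e. no kernel flipping; the concrete sizes are just for illustration):

import numpy as np

m1, m2, n_channel, n_filter = 5, 6, 3, 2
k1, k2 = 2, 4
A = np.random.random((m1, m2, n_channel))
K = np.random.random((k1, k2, n_channel, n_filter))

# windows over the two spatial axes: shape (m1-k1+1, m2-k2+1, n_channel, k1, k2)
windows = np.lib.stride_tricks.sliding_window_view(A, (k1, k2), axis=(0, 1))
# B[x, y, f] = sum over k, l, c of A[x+k, y+l, c] * K[k, l, c, f]
B = np.einsum('xyckl,klcf->xyf', windows, K)
print(B.shape)  # (4, 3, 2)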

I am trying to find a pure NumPy solution that generates this W_K from K, without using any Python loops.

I can see that W_K[i, j, f] == np.pad(K[..., f], ((i, m1-i-k1), (j, m2-j-k2), (0, 0)), 'constant', constant_values=0), or, equivalently, W_K[i, j, f, i:i+k1, j:j+k2, ...] == K[..., f].

What I'm looking for is similar to a Toeplitz matrix, but in multiple dimensions.
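For intuition, here is a tiny 1-D analogue (one channel, one filter; sizes chosen just for illustration), where the operator matrix W is a rectangular Toeplitz matrix:

import numpy as np

# 1-D analogue: a length-3 kernel applied to a length-6 signal ('valid' padding)
kern = np.array([1., 2., 3.])
m, n = 6, 3
W = np.zeros((m - n + 1, m))
for i in range(m - n + 1):
    W[i, i:i+n] = kern  # each row is the kernel, shifted right by one

print(W)
# [[1. 2. 3. 0. 0. 0.]
#  [0. 1. 2. 3. 0. 0.]
#  [0. 0. 1. 2. 3. 0.]
#  [0. 0. 0. 1. 2. 3.]]
# W is constant along each diagonal (Toeplitz), and
# W @ a == np.correlate(a, kern, 'valid') for any signal a of length 6.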

Example in loopy code:

import numpy as np

# 5x5 image with 3-channels
A = np.random.random((5,5,3))
# 2x2 Conv2D kernel with 2 filters for A  
K = np.random.random((2,2,3,2))

# W_K should have shape (4,4,2,5,5,3); I build it with the filter axis last
# for convenience and move that axis into place at the end.
W_K = np.empty((4,4,5,5,3,2))
for i, j in np.ndindex(4, 4):
  W_K[i, j] = np.pad(K, ((i, 5-i-2),(j, 5-j-2), (0, 0), (0, 0)), 'constant', constant_values=0)

# above lines can also be rewritten as
W_K = np.zeros((4,4,5,5,3,2))
for i, j in np.ndindex(4, 4):
  W_K[i, j, i:i+2, j:j+2, ...] = K[...]

W_K = np.moveaxis(W_K, -1, 2)

# now I can do
B = np.tensordot(W_K, A, 3)
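As a sanity check, B can be compared against a direct sliding-window computation (same caveats as the sketch above):

windows = np.lib.stride_tricks.sliding_window_view(A, (2, 2), axis=(0, 1))
B_direct = np.einsum('xyckl,klcf->xyf', windows, K)
print(np.allclose(B, B_direct))  # True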

What you want needs a bit of fancy indexing gymnastics but it's not very cumbersome to code. The idea is to create 4-dimensional index arrays that apply the W_K[i, j, i:i+2, j:j+2, ...] part of your second loopy example.

Here's a slightly modified version of your example, just to make sure that some relevant dimensions differ (because this makes bugs easier to find: they would be proper errors rather than mangled values):

import numpy as np

# parameter setup
k1, k2, nch, nf = 2, 4, 3, 2 
m1, m2 = 5, 6 
w1, w2 = m1 - k1 + 1, m2 - k2 + 1 
K = np.random.random((k1, k2, nch, nf)) 
A = np.random.random((m1, m2, nch)) 

# your loopy version for comparison
W_K = np.zeros((w1, w2, nf, m1, m2, nch)) 
for i, j in np.ndindex(w1, w2): 
    W_K[i, j, :, i:i+k1, j:j+k2, ...] = K.transpose(-1, 0, 1, 2) 

W_K2 = np.zeros((w1, w2, m1, m2, nch, nf))  # filter axis last; moved back below
i, j = np.mgrid[:w1, :w2][..., None, None]  # each of shape (w1, w2, 1, 1)
k, l = np.mgrid[:k1, :k2]                   # each of shape (k1, k2), broadcasting like (1, 1, k1, k2)

W_K2[i, j, i + k, j + l, ...] = K
W_K2 = np.moveaxis(W_K2, -1, 2) 

print(np.array_equal(W_K, W_K2))  # True

We first create an index mesh i, j that spans the first two dimensions of W_K2, then create two similar meshes k, l that, offset by i and j, index into the two spatial dimensions of sizes m1 and m2 (before the moveaxis). By injecting two trailing singleton dimensions into the former, we end up with 4-D index arrays that together span the first four dimensions of W_K2.
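Continuing the example above, the broadcast shapes can be inspected directly; i + k broadcasts to (w1, w2, k1, k2), enumerating every combination of window position and in-window offset:

print(i.shape, k.shape)              # (4, 3, 1, 1) (2, 4)
print((i + k).shape, (j + l).shape)  # (4, 3, 2, 4) (4, 3, 2, 4)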

All that's left is to assign K to this fancy-indexed slice and move the filter dimension back. Because advanced indexing changes behaviour when the advanced indices in an expression are not all next to one another (the broadcast dimensions are then moved to the front of the result), this is much easier to do with your moveaxis approach. I first tried to create W_K2 with its final axis order, but then we'd have W_K2[i, j, :, i+k, j+l, ...], which behaves subtly differently (in particular, it yields a different shape).
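To see that subtlety concretely, compare the result shapes of the two index orders (W_K3 and W_K4 below are throwaway zero arrays, used only for the shape comparison):

# the slice ':' splits the advanced indices, so the broadcast dimensions
# (w1, w2, k1, k2) are moved to the front and the nf, nch axes trail behind
W_K3 = np.zeros((w1, w2, nf, m1, m2, nch))
print(W_K3[i, j, :, i + k, j + l].shape)  # (4, 3, 2, 4, 2, 3)

# advanced indices adjacent at the front: the remaining axes keep their order
W_K4 = np.zeros((w1, w2, m1, m2, nch, nf))
print(W_K4[i, j, i + k, j + l].shape)     # (4, 3, 2, 4, 3, 2)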
