Numpy matrix multiplication using numpy.einsum with vectorization

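I want to rotate an image.

start and norm both have shape (429, 1024, 3), and rot has shape (3, 3). The following code works correctly, but it takes a long time to finish.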

import numpy

# Rotation by 30 degrees
s = numpy.sin(numpy.pi * 30 / 180)
c = numpy.cos(numpy.pi * 30 / 180)

rot = [[1.0, 0.0, 0.0],
  [0.0,   c,   s],
  [0.0,  -s,   c]]


for i in range(height):
  for j in range(width):
    for arr in [start, norm]:
        x = arr[i,j,0]
        y = arr[i,j,1]
        z = arr[i,j,2]
        for d in range(3):
            arr[i,j,d] = rot[d][0] * x + rot[d][1] * y + rot[d][2] * z

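I tried to vectorize the code, but I got stuck on how to use numpy.einsum to express the multiplication that has to be applied to each pixel's vector.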

# Rotation by 30 degrees
s = numpy.sin(numpy.pi * 30 / 180)
c = numpy.cos(numpy.pi * 30 / 180)

rot = numpy.array([[1.0, 0.0, 0.0], [0.0,   c,   s], [0.0,  -s,   c]])
 
start[:,:,:3] = numpy.einsum('ij,j',rot[:3,0],start[:,:,0]) + 
 numpy.einsum('ij,j',rot[:3,1],start[:,:,1]) + numpy.einsum('ij,j',rot[:3,2],start[:,:,2])

norm[:,:,:3] = numpy.einsum('ij,j',rot[:3,0],norm[:,:,0]) + 
 numpy.einsum('ij,j',rot[:3,1],norm[:,:,1]) + numpy.einsum('ij,j',rot[:3,2],norm[:,:,2])

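The code above raises the error "einstein sum subscripts string contains too many subscripts for operand 0".

What changes should I make to the vectorized form of the code?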

From what I can see in your code, the correct einsum signature should be:

start = np.einsum('ijl, kl -> ijk', start, rot)
norm  = np.einsum('ijl, kl -> ijk',  norm, rot)
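
As a quick sanity check, here is a minimal sketch that compares this einsum call against the explicit per-pixel loop from the question and against the equivalent matrix product a @ rot.T. The small random array a and the names expected, via_einsum and via_matmul are illustrative assumptions standing in for start or norm:

```python
import numpy as np

s = np.sin(np.pi * 30 / 180)
c = np.cos(np.pi * 30 / 180)
rot = np.array([[1.0, 0.0, 0.0],
                [0.0,   c,   s],
                [0.0,  -s,   c]])

# Small random stand-in for the (height, width, 3) array (assumed data)
rng = np.random.default_rng(0)
a = rng.uniform(0.0, 1.0, (4, 5, 3))

# Reference: multiply each pixel's 3-vector by rot, one pixel at a time
expected = np.empty_like(a)
for i in range(a.shape[0]):
    for j in range(a.shape[1]):
        expected[i, j] = rot @ a[i, j]

# out[i, j, k] = sum over l of a[i, j, l] * rot[k, l]
via_einsum = np.einsum('ijl,kl->ijk', a, rot)

# The same contraction written as a matrix product over the last axis
via_matmul = a @ rot.T

assert np.allclose(via_einsum, expected)
assert np.allclose(via_matmul, expected)
```

The subscripts read as follows: i and j index the pixel, l is summed over the three input components, and k indexes the three output components.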

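However, the last dimension of an image array holds RGB colors, not XYZ coordinates (as @QuongHoang pointed out in the comments), so this will not "rotate" the picture the way you want. You are only rotating the color space.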
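
To illustrate the point, here is a small sketch that lights a single pixel in an otherwise black image and applies the same einsum. The 5x5 image, the pixel position and the channel values are made up for illustration. The pixel stays where it was; only its three channel values change:

```python
import numpy as np

s = np.sin(np.pi * 30 / 180)
c = np.cos(np.pi * 30 / 180)
rot = np.array([[1.0, 0.0, 0.0],
                [0.0,   c,   s],
                [0.0,  -s,   c]])

# Hypothetical test image: all black except one pixel at row 1, column 3
img = np.zeros((5, 5, 3))
img[1, 3] = [0.2, 0.5, 0.8]

out = np.einsum('ijl,kl->ijk', img, rot)

# Positions of the non-black pixels before and after the multiplication
before = np.argwhere(img.any(axis=2))
after = np.argwhere(out.any(axis=2))

print(before)     # location of the lit pixel in the input
print(after)      # same location in the output
print(out[1, 3])  # channel values after being mixed by rot
```

Moving pixels to new positions is a different operation (a geometric transform of the row and column coordinates), which this matrix product does not perform.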

You can use the following simple code:

a_ = np.empty_like(a)
for d in range(3):
    a_[:, :, d] = rot[d][0] * a[:, :, 0] + rot[d][1] * a[:, :, 1] + rot[d][2] * a[:, :, 2]
a = a_

A complete usage example follows:

import numpy as np

height, width = 100, 179
a = np.random.uniform(0., 1., (height, width, 3))
a0, a1 = np.copy(a), np.copy(a)

s = np.sin(np.pi * 30 / 180)
c = np.cos(np.pi * 30 / 180)

rot = [[1.0, 0.0, 0.0],
  [0.0,   c,   s],
  [0.0,  -s,   c]]

# -------- Version 1, non-vectorized --------------

for i in range(height):
  for j in range(width):
    for arr in [a0]:
        x = arr[i,j,0]
        y = arr[i,j,1]
        z = arr[i,j,2]
        for d in range(3):
            arr[i,j,d] = rot[d][0] * x + rot[d][1] * y + rot[d][2] * z

# -------- Version 2, vectorized --------------

a1_ = np.empty_like(a1)
for d in range(3):
    a1_[:, :, d] = rot[d][0] * a1[:, :, 0] + rot[d][1] * a1[:, :, 1] + rot[d][2] * a1[:, :, 2]
a1 = a1_

# ------ Check that we have same solution --------
assert np.allclose(a0, a1)

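Below is timing code for the two solutions. In this run the vectorized solution was about 48x faster than the non-vectorized one. The code needs the modules installed once via python -m pip install numpy timerit: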

# Needs: python -m pip install numpy timerit

import numpy as np

s = np.sin(np.pi * 30 / 180)
c = np.cos(np.pi * 30 / 180)

rot = [[1.0, 0.0, 0.0],
  [0.0,   c,   s],
  [0.0,  -s,   c]]

# -------- Version 1, non-vectorized --------------

def f0(a):
    height, width = a.shape[:2]
    a = np.copy(a)
    for i in range(height):
      for j in range(width):
        for arr in [a]:
            x = arr[i,j,0]
            y = arr[i,j,1]
            z = arr[i,j,2]
            for d in range(3):
                arr[i,j,d] = rot[d][0] * x + rot[d][1] * y + rot[d][2] * z
    return a

# -------- Version 2, vectorized --------------

def f1(a):
    height, width = a.shape[:2]
    a_ = np.empty_like(a)
    for d in range(3):
        a_[:, :, d] = rot[d][0] * a[:, :, 0] + rot[d][1] * a[:, :, 1] + rot[d][2] * a[:, :, 2]
    a = a_
    return a

# ----------- Time/Speedup Measure ----------------

from timerit import Timerit
Timerit._default_asciimode = True

height, width = 100, 179
a = np.random.uniform(0., 1., (height, width, 3))

ra, rt = None, None
for f in [f0, f1]:
    print(f'{f.__name__}: ', end = '', flush = True)
    tim = Timerit(num = 15, verbose = 1)
    for t in tim:    
        ca = f(a)
    ct = tim.mean()
    if ra is None:
        ra, rt = ca, ct
    else:
        assert np.allclose(ra, ca)
        print(f'speedup {round(rt / ct, 2)}x')

Output:

f0: Timed best=144.785 ms, mean=146.205 +- 1.9 ms
f1: Timed best=2.781 ms, mean=3.022 +- 0.1 ms
speedup 48.38x
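
The einsum variant from the start of this answer was not part of the measurement above. A minimal sketch for timing it against the per-channel version, using only the standard library timeit instead of timerit, could look like this. The function names per_channel and with_einsum are hypothetical, and no timings are quoted here because they depend on the machine and the NumPy build:

```python
import timeit
import numpy as np

s = np.sin(np.pi * 30 / 180)
c = np.cos(np.pi * 30 / 180)
rot = np.array([[1.0, 0.0, 0.0],
                [0.0,   c,   s],
                [0.0,  -s,   c]])

def per_channel(a):
    # Same arithmetic as Version 2 above: one vectorized expression per channel
    out = np.empty_like(a)
    for d in range(3):
        out[:, :, d] = (rot[d, 0] * a[:, :, 0]
                        + rot[d, 1] * a[:, :, 1]
                        + rot[d, 2] * a[:, :, 2])
    return out

def with_einsum(a):
    # Single einsum call contracting the last axis of a with rot
    return np.einsum('ijl,kl->ijk', a, rot)

a = np.random.uniform(0., 1., (100, 179, 3))

# Both variants must agree before timing them
assert np.allclose(per_channel(a), with_einsum(a))

for f in (per_channel, with_einsum):
    # Best of 5 repeats, 20 calls each, reported per call
    best = min(timeit.repeat(lambda: f(a), number=20, repeat=5)) / 20
    print(f'{f.__name__}: best {best * 1e3:.3f} ms per call')
```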
