
How to add neighboring elements of a 2D array in Python without having to use nested loops?

I want to sum the neighboring elements (a 3x3 window) of an array and create a new array from those sums. When using nested loops this takes a long time, since this piece of code will be called thousands of times.

import numpy as np

rows=6
cols=8

array1 = np.random.randint(10, size=(rows, cols))
print(array1)

array2=np.zeros((rows-2,cols-2))

for i in range(rows-2):
    for j in range(cols-2):
        array2[i, j] = np.sum(array1[i:i+3, j:j+3])  # sum of the 3x3 window at (i, j)
        
print("output\n",array2)


My output:
    [[9 4 9 6 1 4 9 0]
     [2 3 4 2 0 0 9 0]
     [2 8 9 7 6 9 4 8]
     [6 3 6 7 7 0 7 5]
     [2 1 4 1 7 6 9 9]
     [1 1 2 6 3 8 1 4]]
    output
     [[50. 52. 44. 35. 42. 43.]
     [43. 49. 48. 38. 42. 42.]
     [41. 46. 54. 50. 55. 57.]
     [26. 31. 43. 45. 48. 49.]]

With vectorization this should be solvable. However, I have tried different techniques, such as reshaping and then adding arrays, or using only one loop over rows or cols, but never had any luck. Note: in my project the number of rows and cols can be very big. The operation is similar to a 2D convolution with a kernel of ones. The question is: is there any way to implement this without loops, or at least to reduce the time complexity so that only rows or cols appears as a loop size?
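For reference, the window sum can also be written without any explicit Python loops using numpy's sliding_window_view. A minimal sketch, assuming numpy 1.20 or newer; this is an aside, separate from the approaches in the answers below:

import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

array1 = np.random.randint(10, size=(6, 8))

# a view of every 3x3 window; shape is (rows-2, cols-2, 3, 3), no copy made
windows = sliding_window_view(array1, (3, 3))

# summing over the two window axes gives the (rows-2, cols-2) result
array2 = windows.sum(axis=(2, 3))

Note that the work is still proportional to the window area, so for large windows the prefix-sum approach discussed below scales better.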

A micro-benchmark of @trenixjetix's convolve2d solution against a numpy implementation of @n-ata's partial_sum. The second approach turns out to be much faster, even for relatively small kernels.

import numpy as np
from scipy.signal import convolve2d

def partial_sum(array1, n):
    # 2D prefix sums: array1[i, j] becomes the sum of the block [:i+1, :j+1]
    array1 = array1.cumsum(1).cumsum(0)
    res = array1.copy()
    # inclusion-exclusion: subtract the strip left of each window and the
    # strip above it, then add back the doubly subtracted corner
    res[:,n:] -= array1[:,:-n]
    res[n:] -= array1[:-n]
    res[n:,n:] += array1[:-n,:-n]
    return res[n-1:,n-1:]

rows = 500
cols = 500
n = 50   # relatively small n, since convolve2d becomes slow
np.random.seed(42)
array1 = np.random.randint(10, size=(rows, cols))

%timeit convolve2d(array1, np.ones((n,n), int), 'valid')
#1 loop, best of 5: 1.3 s per loop

%timeit partial_sum(array1, n)
#100 loops, best of 5: 3.36 ms per loop

Correctness of partial_sum is verified by comparing it with convolve2d results.

np.testing.assert_equal(partial_sum(array1, n), convolve2d(array1, np.ones((n,n), int), 'valid'))

Runtime complexity for growing n

[Plot: runtime of partial_sum vs. convolve2d for growing n]

Code used for this benchmark

import perfplot

perfplot.show(
    setup=lambda n: (np.random.randint(10, size=(500, 500)), n),
    kernels=[
        lambda a, n: partial_sum(a, n),
        lambda a, n: convolve2d(a, np.ones((n,n), int), 'valid')
    ],
    labels=['partial_sum','convolve2d'],
    n_range=[k for k in range(10,80,5)]
)

Here is the solution I find better for such cases: use the convolve2d function from the scipy.signal module. I used np.ones to declare the kernel as a numpy array and convolve2d to apply it to every part of the input. This technique is known as convolutional filters and kernels, and it is used a lot in image processing with Python.

In [50]: import numpy as np

In [55]: np.ones((3,3))
Out[55]: 
array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])

In [59]: input_matrix
Out[59]: 
array([[9, 4, 9, 6, 1, 4, 9, 0],
       [2, 3, 4, 2, 0, 0, 9, 0],
       [2, 8, 9, 7, 6, 9, 4, 8],
       [6, 3, 6, 7, 7, 0, 7, 5],
       [2, 1, 4, 1, 7, 6, 9, 9],
       [1, 1, 2, 6, 3, 8, 1, 4]])

In [60]: kernel
Out[60]: 
array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])

In [61]: from scipy.signal import convolve2d

In [63]: convolve2d(input_matrix, kernel, 'valid')
Out[63]: 
array([[50., 52., 44., 35., 42., 43.],
       [43., 49., 48., 38., 42., 42.],
       [41., 46., 54., 50., 55., 57.],
       [26., 31., 43., 45., 48., 49.]])

Also, the speed of this is quite good. As you can see, even on a 1000x1000 matrix it is fast enough.

In [68]: %timeit convolve2d(input_matrix, kernel, 'valid')
5.24 µs ± 21.2 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

In [69]: big_matrix = np.random.randint(10, size=(1000, 1000))

In [70]: %timeit convolve2d(big_matrix, kernel, 'valid')
41.6 ms ± 555 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

Here you should use partial sums: after one preprocessing pass you can get any window sum in O(1) operations per point.

For small arrays there is not much of a performance difference, but with big arrays you will see a large one.

rows = number of rows

cols = number of columns

innerrow = the window height (3 in your example)

innercol = the window width (3 in your example)

With your code:

O(rows x cols x innerrow x innercol)

e.g. a 2000x2000 array with a 200x200 window takes on the order of 2000 x 2000 x 200 x 200 operations.

But with partial sums you can calculate it in:

O(rows x cols)

e.g. 2000 x 2000 operations for the same array.
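To make the identity concrete before the full implementation below, here is a minimal numpy sketch of the 2D prefix-sum (summed-area table) trick; the 6x8 array and 3x3 window are taken from the question, the rest is illustrative:

import numpy as np

a = np.random.randint(10, size=(6, 8))

# S[i, j] holds the sum of the block a[:i+1, :j+1]
S = a.cumsum(axis=1).cumsum(axis=0)

# any 3x3 window sum follows by inclusion-exclusion on S
i, j = 2, 4
total = S[i+2, j+2]
if i > 0:
    total -= S[i-1, j+2]    # strip above the window
if j > 0:
    total -= S[i+2, j-1]    # strip left of the window
if i > 0 and j > 0:
    total += S[i-1, j-1]    # corner was subtracted twice, add it back

assert total == a[i:i+3, j:j+3].sum()

Building S once costs O(rows x cols); every window sum afterwards is at most four array lookups.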

import numpy as np

rows = 1000
cols = 2000

array1 = np.random.randint(10, size=(rows, cols))
print(array1)

array2 = np.zeros((rows-100, cols-100))

# turn array1 into its 2D prefix sum, in place:
# first accumulate along each row...
for i in range(0, rows):
    for j in range(1, cols):
        array1[i][j] += array1[i][j-1]

# ...then along each column
for i in range(0, cols):
    for j in range(1, rows):
        array1[j][i] += array1[j-1][i]

# each window sum (rows i..i+100, cols j..j+100) by inclusion-exclusion
for i in range(rows-100):
    for j in range(cols-100):
        sm = array1[i+100][j+100]
        if i-1 >= 0:
            sm -= array1[i-1][j+100]
        if j-1 >= 0:
            sm -= array1[i+100][j-1]
        if i-1 >= 0 and j-1 >= 0:
            sm += array1[i-1][j-1]
        array2[i, j] = sm

print("output\n", array2)

And here is the output (shown here for the question's 6x8 example with a 3x3 window):

[[1 9 1 9 2 0 3 8]
 [7 1 8 7 1 2 8 4]
 [7 1 5 4 8 3 9 0]
 [4 4 8 9 3 1 7 6]
 [2 5 9 9 3 6 7 2]
 [9 0 9 5 0 3 2 8]]
output
 [[40. 45. 45. 36. 36. 37.]
 [45. 47. 53. 38. 42. 40.]
 [45. 54. 58. 46. 47. 41.]
 [50. 58. 55. 39. 32. 42.]]
