numpy.tile a non-integer number of times

Is there a better way in numpy to tile an array a non-integer number of times? This gets the job done, but it is clunky and doesn't easily generalize to n dimensions:

import numpy as np
arr = np.arange(6).reshape((2, 3))
desired_shape = (5, 8)
reps = tuple([x // y for x, y in zip(desired_shape, arr.shape)])
left = tuple([x % y for x, y in zip(desired_shape, arr.shape)])
tmp = np.tile(arr, reps)
tmp = np.r_[tmp, tmp[slice(left[0]), :]]
tmp = np.c_[tmp, tmp[:, slice(left[1])]]

This yields:

array([[0, 1, 2, 0, 1, 2, 0, 1],
       [3, 4, 5, 3, 4, 5, 3, 4],
       [0, 1, 2, 0, 1, 2, 0, 1],
       [3, 4, 5, 3, 4, 5, 3, 4],
       [0, 1, 2, 0, 1, 2, 0, 1]])

EDIT: Performance results

Here are some timing tests of the three answers that generalize to n dimensions. These definitions were put in a file named newtile.py:

import numpy as np

def tile_pad(a, dims):
    return np.pad(a, tuple((0, i) for i in (np.array(dims) - a.shape)),
                  mode='wrap')

def tile_meshgrid(a, dims):
    # tuple(...): newer NumPy no longer treats a list as a multi-dimensional index
    return a[tuple(np.meshgrid(*[np.arange(j) % k for j, k in zip(dims, a.shape)],
                               sparse=True, indexing='ij'))]

def tile_rav_mult_idx(a, dims):
    return a.flat[np.ravel_multi_index(np.indices(dims), a.shape, mode='wrap')]

Here are the bash lines:

python -m timeit -s 'import numpy as np' 'import newtile' 'newtile.tile_pad(np.arange(30).reshape(2, 3, 5), (3, 5, 7))'
python -m timeit -s 'import numpy as np' 'import newtile' 'newtile.tile_meshgrid(np.arange(30).reshape(2, 3, 5), (3, 5, 7))'
python -m timeit -s 'import numpy as np' 'import newtile' 'newtile.tile_rav_mult_idx(np.arange(30).reshape(2, 3, 5), (3, 5, 7))'

python -m timeit -s 'import numpy as np' 'import newtile' 'newtile.tile_pad(np.arange(2310).reshape(2, 3, 5, 7, 11), (13, 17, 19, 23, 29))'
python -m timeit -s 'import numpy as np' 'import newtile' 'newtile.tile_meshgrid(np.arange(2310).reshape(2, 3, 5, 7, 11), (13, 17, 19, 23, 29))'
python -m timeit -s 'import numpy as np' 'import newtile' 'newtile.tile_rav_mult_idx(np.arange(2310).reshape(2, 3, 5, 7, 11), (13, 17, 19, 23, 29))'

Here are the results with small arrays (2 x 3 x 5):

pad:               10000 loops, best of 3: 106 usec per loop
meshgrid:          10000 loops, best of 3: 56.4 usec per loop
ravel_multi_index: 10000 loops, best of 3: 50.2 usec per loop

Here are the results with larger arrays (2 x 3 x 5 x 7 x 11):

pad:               10 loops, best of 3: 25.2 msec per loop
meshgrid:          10 loops, best of 3: 300 msec per loop
ravel_multi_index: 10 loops, best of 3: 218 msec per loop

So np.pad is the slowest of the three for small arrays but roughly an order of magnitude faster for the larger ones, which makes it probably the most performant choice overall.
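The same comparison can also be run from inside Python, which makes it easy to confirm first that the three functions return identical arrays. The following is only a sketch: it repeats the three definitions from newtile.py (with the index arrays wrapped in tuple(), an adjustment that newer NumPy releases need), and the helper name compare and its number/repeat defaults are illustrative choices, not part of the original benchmark.

```python
import timeit

import numpy as np


def tile_pad(a, dims):
    return np.pad(a, tuple((0, i) for i in (np.array(dims) - a.shape)),
                  mode='wrap')


def tile_meshgrid(a, dims):
    grids = np.meshgrid(*[np.arange(j) % k for j, k in zip(dims, a.shape)],
                        sparse=True, indexing='ij')
    # Index with a tuple so this is a multi-dimensional index on any NumPy version.
    return a[tuple(grids)]


def tile_rav_mult_idx(a, dims):
    coords = tuple(np.indices(dims))
    return a.flat[np.ravel_multi_index(coords, a.shape, mode='wrap')]


def compare(a, dims, number=100, repeat=3):
    """Check that all three functions agree, then return the best time per call (seconds)."""
    funcs = (tile_pad, tile_meshgrid, tile_rav_mult_idx)
    expected = funcs[0](a, dims)
    for f in funcs[1:]:
        assert np.array_equal(f(a, dims), expected), f.__name__
    timings = {}
    for f in funcs:
        best = min(timeit.repeat(lambda: f(a, dims), number=number, repeat=repeat))
        timings[f.__name__] = best / number
    return timings


small = np.arange(30).reshape(2, 3, 5)
for name, seconds in compare(small, (3, 5, 7)).items():
    print(f"{name}: {seconds * 1e6:.1f} usec per call")
```

Absolute numbers will differ from the ones above depending on the machine and the NumPy version, so only the relative ordering is worth comparing.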

Here's a pretty concise method:

In [57]: a
Out[57]: 
array([[0, 1, 2],
       [3, 4, 5]])

In [58]: old = a.shape

In [59]: new = (5, 8)

In [60]: a[(np.arange(new[0]) % old[0])[:,None], np.arange(new[1]) % old[1]]
Out[60]: 
array([[0, 1, 2, 0, 1, 2, 0, 1],
       [3, 4, 5, 3, 4, 5, 3, 4],
       [0, 1, 2, 0, 1, 2, 0, 1],
       [3, 4, 5, 3, 4, 5, 3, 4],
       [0, 1, 2, 0, 1, 2, 0, 1]])

Here's an n-dimensional generalization:

def rep_shape(a, shape):
    indices = np.meshgrid(*[np.arange(k) % j for j, k in zip(a.shape, shape)],
                          sparse=True, indexing='ij')
    # Index with a tuple: newer NumPy no longer treats a list as a multi-dimensional index.
    return a[tuple(indices)]

For example:

In [89]: a
Out[89]: 
array([[0, 1, 2],
       [3, 4, 5]])

In [90]: rep_shape(a, (5, 8))
Out[90]: 
array([[0, 1, 2, 0, 1, 2, 0, 1],
       [3, 4, 5, 3, 4, 5, 3, 4],
       [0, 1, 2, 0, 1, 2, 0, 1],
       [3, 4, 5, 3, 4, 5, 3, 4],
       [0, 1, 2, 0, 1, 2, 0, 1]])

In [91]: rep_shape(a, (4, 2))
Out[91]: 
array([[0, 1],
       [3, 4],
       [0, 1],
       [3, 4]])

In [92]: b = np.arange(24).reshape(2,3,4)

In [93]: b
Out[93]: 
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],

       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])

In [94]: rep_shape(b, (3,4,5))
Out[94]: 
array([[[ 0,  1,  2,  3,  0],
        [ 4,  5,  6,  7,  4],
        [ 8,  9, 10, 11,  8],
        [ 0,  1,  2,  3,  0]],

       [[12, 13, 14, 15, 12],
        [16, 17, 18, 19, 16],
        [20, 21, 22, 23, 20],
        [12, 13, 14, 15, 12]],

       [[ 0,  1,  2,  3,  0],
        [ 4,  5,  6,  7,  4],
        [ 8,  9, 10, 11,  8],
        [ 0,  1,  2,  3,  0]]])

Here's how the first example works...

The idea is to use arrays to index a. Take a look at np.arange(new[0]) % old[0]:

In [61]: np.arange(new[0]) % old[0]
Out[61]: array([0, 1, 0, 1, 0])

Each value in that array gives the row of a to use in the result. Similarly,

In [62]: np.arange(new[1]) % old[1]
Out[62]: array([0, 1, 2, 0, 1, 2, 0, 1])

gives the columns of a to use in the result. For these index arrays to create a 2-d result, we have to reshape the first one into a column:

In [63]: (np.arange(new[0]) % old[0])[:,None]
Out[63]: 
array([[0],
       [1],
       [0],
       [1],
       [0]])

When arrays are used as indices, they broadcast. Here's what the broadcast indices look like:

In [65]: i, j = np.broadcast_arrays((np.arange(new[0]) % old[0])[:,None], np.arange(new[1]) % old[1])

In [66]: i
Out[66]: 
array([[0, 0, 0, 0, 0, 0, 0, 0],
       [1, 1, 1, 1, 1, 1, 1, 1],
       [0, 0, 0, 0, 0, 0, 0, 0],
       [1, 1, 1, 1, 1, 1, 1, 1],
       [0, 0, 0, 0, 0, 0, 0, 0]])

In [67]: j
Out[67]: 
array([[0, 1, 2, 0, 1, 2, 0, 1],
       [0, 1, 2, 0, 1, 2, 0, 1],
       [0, 1, 2, 0, 1, 2, 0, 1],
       [0, 1, 2, 0, 1, 2, 0, 1],
       [0, 1, 2, 0, 1, 2, 0, 1]])

These are the index arrays that we need to generate the array with shape (5, 8):

In [68]: a[i,j]
Out[68]: 
array([[0, 1, 2, 0, 1, 2, 0, 1],
       [3, 4, 5, 3, 4, 5, 3, 4],
       [0, 1, 2, 0, 1, 2, 0, 1],
       [3, 4, 5, 3, 4, 5, 3, 4],
       [0, 1, 2, 0, 1, 2, 0, 1]])

When index arrays are given as in the example at the beginning (i.e. using (np.arange(new[0]) % old[0])[:,None] in the first index slot), numpy doesn't actually generate these index arrays in memory like I did with i and j. i and j show the effective contents when broadcasting occurs.

The function rep_shape does the same thing, using np.meshgrid to generate the index arrays for each "slot" with the correct shapes for broadcasting.
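NumPy also ships a helper, np.ix_, that builds the same kind of broadcast-ready ("open mesh") index arrays from 1-d sequences. As an alternative sketch, not part of the original answer, the same function could be written with it; the name rep_shape_ix is made up for this example.

```python
import numpy as np


def rep_shape_ix(a, shape):
    # One 1-d index array per axis: 0, 1, ..., k-1 wrapped modulo the axis length j.
    index_1d = [np.arange(k) % j for j, k in zip(a.shape, shape)]
    # np.ix_ reshapes them to (k0, 1, ...), (1, k1, ...), ... so that they
    # broadcast against each other, and returns them as a tuple for indexing.
    return a[np.ix_(*index_1d)]


a = np.arange(6).reshape(2, 3)
rows, cols = np.ix_(np.arange(5) % 2, np.arange(8) % 3)
print(rows.shape, cols.shape)         # (5, 1) (1, 8)
print(rep_shape_ix(a, (5, 8)).shape)  # (5, 8)
```

Because np.ix_ already returns a tuple, no extra conversion is needed before indexing.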

Maybe not very efficient but very concise:

arr = np.arange(6).reshape((2, 3))
desired_shape = (5, 8)

arr.flat[np.ravel_multi_index(np.indices(desired_shape), arr.shape, mode='wrap')]
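To see why this works, the one-liner can be split into its three steps. The sketch below uses the same arr and desired_shape; the intermediate names coords, flat_index and result are introduced here only for illustration.

```python
import numpy as np

arr = np.arange(6).reshape((2, 3))
desired_shape = (5, 8)

# Step 1: coordinate grids for every cell of the target array.
# coords[0][r, c] == r and coords[1][r, c] == c.
coords = np.indices(desired_shape)
print(coords.shape)  # (2, 5, 8)

# Step 2: turn each (row, col) pair into a flat index into arr.
# mode='wrap' first reduces each coordinate modulo the matching arr dimension,
# so flat_index[r, c] == (r % 2) * 3 + (c % 3) for this 2 x 3 array.
flat_index = np.ravel_multi_index(tuple(coords), arr.shape, mode='wrap')

# Step 3: gather the elements through the flat view of arr.
result = arr.flat[flat_index]
print(result.shape)  # (5, 8)
```

The cost of this approach is that step 1 materializes one full-size integer array per dimension, which is consistent with the "maybe not very efficient" caveat.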

Another solution which is even more concise:

arr = np.arange(6).reshape((2, 3))
desired_shape = np.array((5, 8))

pads = tuple((0, i) for i in (desired_shape-arr.shape))
# pads = ((0, add_rows), (0, add_columns), ...)
np.pad(arr, pads, mode="wrap")

but it is slower for small arrays (much faster for large ones, though). Strangely, np.pad won't accept an np.array for pads.
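One limitation worth noting: pad widths cannot be negative, so this version raises an error when the desired shape is smaller than arr along some axis (for example (4, 2) for a 2 x 3 array). A possible workaround, sketched here under the assumption that keeping the leading entries of such an axis is what is wanted, is to slice first and pad afterwards. The function name pad_to_shape is hypothetical.

```python
import numpy as np


def pad_to_shape(a, shape):
    a = np.asarray(a)
    if len(shape) != a.ndim:
        raise ValueError("shape must have one entry per axis of a")
    # Crop every axis that is longer than requested (keeps the leading entries).
    cropped = a[tuple(slice(0, min(n, t)) for n, t in zip(a.shape, shape))]
    # Whatever is still missing is appended at the end of each axis by wrapping.
    pads = tuple((0, t - n) for n, t in zip(cropped.shape, shape))
    return np.pad(cropped, pads, mode="wrap")


arr = np.arange(6).reshape((2, 3))
print(pad_to_shape(arr, (5, 8)).shape)  # (5, 8)
print(pad_to_shape(arr, (4, 2)).shape)  # (4, 2)
```

Building pads as a tuple of tuples also sidesteps the question of whether np.pad accepts an array there.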

Not sure about n dimensions, but you can consider using hstack and vstack.

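import numpy as np

arr = np.arange(6).reshape((2, 3))

nx, ny = arr.shape
Nx, Ny = 5, 8  # the new shape: Nx rows, Ny columns
iX, iY = Nx // nx + 1, Ny // ny + 1  # enough whole copies along each axis

# Repeat the columns first (hstack), then the rows (vstack), trimming each to size.
result = np.vstack([np.hstack([arr] * iY)[:, :Ny]] * iX)[:Nx, :]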

There is a dstack, but I doubt that it is going to help. Not entirely sure about 3 and higher dimensions.
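For three or more dimensions, the same "make too many copies, then trim" idea can be expressed with np.tile and a tuple of slices instead of nested hstack/vstack calls. This is a sketch rather than part of the original answer: tile_crop is a made-up name, and it assumes that shape has one entry per axis of the array and that no axis of the array has length zero.

```python
import numpy as np


def tile_crop(a, shape):
    a = np.asarray(a)
    if len(shape) != a.ndim:
        raise ValueError("shape must have one entry per axis of a")
    # Ceiling division: the number of whole copies needed to cover each axis.
    reps = tuple(-(-t // n) for t, n in zip(shape, a.shape))
    # Tile generously, then keep only the first shape[i] entries along axis i.
    return np.tile(a, reps)[tuple(slice(0, t) for t in shape)]


b = np.arange(24).reshape(2, 3, 4)
print(tile_crop(b, (3, 4, 5)).shape)  # (3, 4, 5)
```

The temporary tiled array can be up to one extra copy larger along each axis than the final result, so this trades some memory for simplicity.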
