How can I assign values from one array to another according to the index more efficiently?

I am trying to replace the values of one array with values from another, according to how many ones are in each row of the source array. The row sum gives the index of the value to take from the replacement array. Thus, if a row contains two ones, the output gets l1[1]; if it contains a single one, the output gets l1[0]; and if it contains none, the output stays 0.

This is easier to see in a specific example:

import numpy as np

l1 = np.array([4, 5])
x112 = np.array([[0, 0], [0, 1], [1, 1], [0, 0], [1, 0], [1, 1]])

array([[0, 0],
       [0, 1],
       [1, 1],
       [0, 0],
       [1, 0],
       [1, 1]])

Required output:

[[0]
 [4]
 [5]
 [0]
 [4]
 [5]]

I did this by counting the ones in each row and assigning accordingly using np.where:

x1x2 = np.array([0, 1, 2, 0, 1, 2])  # number of ones in each row
x1x2 = np.where(x1x2 != 1, x1x2, l1[0])  # one 1 in the row -> l1[0]
x1x2 = np.where(x1x2 != 2, x1x2, l1[1])  # two 1s in the row -> l1[1]
print(x1x2)

Output:

[0 4 5 0 4 5]

Could this be done more efficiently?

Okay, I actually gave devectorizing your code a shot. First, the vectorized NumPy you have:

def op(x112, l1):
    # bit of cheating, adding instead of counting 1s
    x1x2 = x112[:,0] + x112[:,1]

    x1x2=np.where(x1x2 != 1, x1x2, l1[0])
    x1x2=np.where(x1x2 != 2, x1x2, l1[1])
    return x1x2
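The sum equals the number of ones only while the array holds nothing but 0s and 1s. A minimal sketch of counting explicitly with np.count_nonzero follows; the names ones_per_row and mixed are introduced here for illustration and are not part of the original code.

```python
import numpy as np

x112 = np.array([[0, 0], [0, 1], [1, 1], [0, 0], [1, 0], [1, 1]])

# Count the entries equal to 1 in each row instead of adding the columns.
ones_per_row = np.count_nonzero(x112 == 1, axis=1)

# For a pure 0/1 array, the count and the column sum agree.
row_sums = x112[:, 0] + x112[:, 1]

# Hypothetical input with a value other than 0 or 1: the two now differ.
mixed = np.array([[2, 0], [1, 1]])
mixed_sums = mixed.sum(axis=1)                        # adds the 2 as well
mixed_counts = np.count_nonzero(mixed == 1, axis=1)   # only counts the 1s
```

If the input is guaranteed to be binary, the plain sum is enough.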

The most efficient alternative is to loop through x112 only once, so let's do a Numba loop.

import numba as nb

@nb.njit
def loop(x112, l1):
    d0, d1 = x112.shape
    x1x2 = np.zeros(d0, dtype = x112.dtype)
    for i in range(d0):
        # actually count the 1s
        num1s = 0
        for j in range(d1):
            if x112[i,j] == 1:
                num1s += 1
        
        if num1s == 1:
            x1x2[i] = l1[0]
        elif num1s == 2:
            x1x2[i] = l1[1]
    return x1x2

The Numba loop has a ~9-10x speed improvement on my laptop.

%timeit op(x112, l1)
8.05 µs ± 34.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

%timeit loop(x112, l1)
873 ns ± 5.09 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

As @Mad_Physicist requested, here are timings with a bigger array. I'm including his advanced-indexing method too.

x112 = np.random.randint(0, 2, size = (100000, 2))
l1_v2 = np.array([0,4,5])

%timeit op(x112, l1)
1.35 ms ± 27.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit loop(x112, l1)
956 µs ± 2.78 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit l1_v2[x112.sum(1)]
1.2 ms ± 1.05 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

EDIT: Okay, maybe take these timings with a grain of salt: when I restarted the IPython kernel and reran this, op(x112, l1) improved to 390 µs ± 22.1 µs per loop, while the other methods retained the same performance (971 µs, 1.23 ms).
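Because the numbers shift between sessions, it may help to rerun the comparison in a fresh process. The sketch below uses the standard timeit module, repeats each measurement and keeps the minimum. The helpers lookup and best_time are names introduced here, the Numba version is left out so the script needs only NumPy, and no timings are quoted because they depend on the machine.

```python
import timeit

import numpy as np


def op(x112, l1):
    # np.where version from above: add the two columns, then substitute.
    x1x2 = x112[:, 0] + x112[:, 1]
    x1x2 = np.where(x1x2 != 1, x1x2, l1[0])
    x1x2 = np.where(x1x2 != 2, x1x2, l1[1])
    return x1x2


def lookup(x112, l1):
    # Advanced-indexing version: prepend 0 and index by the row sums.
    return np.r_[0, l1][x112.sum(1)]


def best_time(func, *args, number=20, repeat=5):
    # Minimum over several repeats, in seconds per call.
    runs = timeit.repeat(lambda: func(*args), number=number, repeat=repeat)
    return min(runs) / number


rng = np.random.default_rng(0)
x112 = rng.integers(0, 2, size=(100000, 2))
l1 = np.array([4, 5])

for func in (op, lookup):
    print(f"{func.__name__}: {best_time(func, x112, l1) * 1e6:.1f} µs per call")
```

Both functions should return the same array, which is worth checking before comparing their speed.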

You can use direct indexing:

l1 = np.array([0, 4, 5])
x112 = np.array([[0, 0], [0, 1], [1, 1], [0, 0], [1, 0], [1, 1]])

result = l1[x112.sum(1)]

This works if you're at liberty to prepend the zero to l1 at creation time. If not:

result = np.r_[0, l1][x112.sum(1)]
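The indexing works because each row sum (0, 1 or 2) is used as a position in a three-element lookup table. The sketch below spells out the intermediate steps and checks them against the np.where version from the question; the names counts, table, expected and column are introduced here for illustration.

```python
import numpy as np

l1 = np.array([4, 5])
x112 = np.array([[0, 0], [0, 1], [1, 1], [0, 0], [1, 0], [1, 1]])

counts = x112.sum(1)     # number of ones per row: [0 1 2 0 1 2]
table = np.r_[0, l1]     # lookup table with the zero prepended: [0 4 5]
result = table[counts]   # one table entry per row: [0 4 5 0 4 5]

# The np.where approach from the question, for comparison.
expected = np.where(counts != 1, counts, l1[0])
expected = np.where(expected != 2, expected, l1[1])

# Add an axis to get the column shape shown in the required output.
column = result[:, None]
```

This assumes the rows hold only 0s and 1s, so that no sum exceeds the last index of the table.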
