Fastest way to fill a NumPy array with new arrays returned by a function

Question

I have a function f(a) that takes one entry from a testarray and returns an array with 5 values:

f(testarray[0])
#Output: array([[0, 1, 5, 3, 2]])

Since f(testarray[0]) is the result of an experiment, I want to run f for each entry of testarray and store each result in a new NumPy array. I always thought this would be quite simple: create an empty NumPy array with the length of testarray and save the results the following way:

N = 1000 #Number of entries of the testarray
test_result  = np.zeros([N, 5], dtype=int)

for i in testarray:
        test_result[i] = f(i)

When I run this, I don't get an error message, but the results are nonsense (half of test_result is empty while the rest is filled with implausible values). Since f() works perfectly for a single entry of testarray, I suppose something is wrong with the way I save the results in test_result. What am I missing here?

(I know that I could collect the results in a list by appending to an initially empty list, but this method is too slow for the large number of times I want to run the function.)

Answer 1

Since you don't seem to understand indexing, stick with this approach:

alist = [f(i) for i in testarray]
arr = np.array(alist)
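One detail worth checking with this approach: the question's output, array([[0, 1, 5, 3, 2]]), has shape (1, 5), so np.array(alist) would come out three-dimensional. The sketch below uses a hypothetical stand-in for f (the real experiment function is not shown in the question) to compare np.array with np.vstack:

```python
import numpy as np

def f(x):
    # Hypothetical stand-in for the experiment: returns shape (1, 5),
    # like the array([[0, 1, 5, 3, 2]]) shown in the question.
    return np.array([[x * k for k in range(1, 6)]])

testarray = np.array([5, 6, 7, 3, 1])
alist = [f(i) for i in testarray]

stacked_3d = np.array(alist)   # shape (5, 1, 5): keeps the extra axis
stacked_2d = np.vstack(alist)  # shape (5, 5): one row per entry
```

If f returns a flat array of shape (5,) instead, np.array(alist) already gives the (N, 5) layout.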

I could show how to use row indices and testarray values together, but that requires more explanation.
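For readers who want the preallocated version anyway, one common way to pair row indices with testarray values is enumerate. This is only a sketch: f here is a hypothetical stand-in for the experiment function, and np.ravel is used so that the assignment works whether f returns shape (5,) or (1, 5).

```python
import numpy as np

def f(x):
    # Hypothetical stand-in for the experiment function in the question.
    return np.array([[x * k for k in range(1, 6)]])

testarray = np.array([5, 6, 7, 3, 1])
test_result = np.zeros((len(testarray), 5), dtype=int)

# row is the position in testarray (0, 1, 2, ...); value is the entry itself.
for row, value in enumerate(testarray):
    test_result[row] = np.ravel(f(value))
```

The position decides which row is written and the value is what gets passed to f, which is the distinction the question's loop misses.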

Answer 2

Your problem can be reproduced with the following small example:

import numpy as np

testarray = np.array([5, 6, 7, 3, 1])

def f(x):
    return np.array([x * i for i in np.arange(1, 6)])

f(testarray[0])
# [ 5 10 15 20 25]

test_result = np.zeros([len(testarray), 5], dtype=int)  # len(testarray) or testarray.shape[0]

So, as hpaulj mentioned in the comments, you must be careful about how you index:

for i in range(len(testarray)):
    test_result[i] = f(testarray[i])

# [[ 5 10 15 20 25]
#  [ 6 12 18 24 30]
#  [ 7 14 21 28 35]
#  [ 3  6  9 12 15]
#  [ 1  2  3  4  5]]
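As an aside on speed: for the toy f in this answer (and only because it is plain arithmetic), the same table can be built without a Python loop via broadcasting. A real experiment function usually cannot be vectorized this way, so treat this as a sketch for the toy case rather than a general recipe.

```python
import numpy as np

testarray = np.array([5, 6, 7, 3, 1])
multipliers = np.arange(1, 6)

# Loop version, equivalent to the toy f applied entry by entry.
looped = np.array([testarray[i] * multipliers for i in range(len(testarray))])

# Broadcasting: (5, 1) * (5,) -> (5, 5), one row per entry of testarray.
vectorized = testarray[:, None] * multipliers

same = np.array_equal(looped, vectorized)
```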

Another case is where testarray is itself an index array: a shuffled permutation of the integers from 0 to N-1, so that every row of the zero array test_result gets filled. A reproducible example for this case:

testarray = np.array([4, 3, 0, 1, 2])

def f(x):
    return np.array([x * i for i in np.arange(1, 6)])

f(testarray[0])
# [ 4  8 12 16 20]

test_result = np.zeros([len(testarray), 5], dtype=int)

In this case, your loop gives the following result:

for i in testarray:
    test_result[i] = f(i)
# [[ 0  0  0  0  0]
#  [ 1  2  3  4  5]
#  [ 2  4  6  8 10]
#  [ 3  6  9 12 15]
#  [ 4  8 12 16 20]]
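Note that the rows above come out ordered by value, not in the order of testarray. A small sketch comparing the two loops (the names by_value and by_position are only illustrative):

```python
import numpy as np

def f(x):
    # Same toy function as in this answer.
    return np.array([x * i for i in np.arange(1, 6)])

testarray = np.array([4, 3, 0, 1, 2])
by_value = np.zeros([len(testarray), 5], dtype=int)
by_position = np.zeros([len(testarray), 5], dtype=int)

for i in testarray:
    by_value[i] = f(i)                 # row chosen by the entry's value

for i in range(len(testarray)):
    by_position[i] = f(testarray[i])   # row chosen by the entry's position

# Reordering the value-indexed rows by testarray recovers the positional layout.
same_after_reorder = np.array_equal(by_value[testarray], by_position)
```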

As this loop shows, if the index array does not contain every integer from 0 to N-1, some rows of the zero array are never written and stay zero (unchanged):

testarray = np.array([4, 2, 4, 1, 2])
test_result = np.zeros([len(testarray), 5], dtype=int)  # start again from zeros

for i in testarray:
    test_result[i] = f(i)
# [[ 0  0  0  0  0]   # <--
#  [ 1  2  3  4  5]
#  [ 2  4  6  8 10]
#  [ 0  0  0  0  0]   # <--
#  [ 4  8 12 16 20]]
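One way to spot this situation before running the loop is to compare the values in testarray with the available row indices. The following is a minimal sketch using np.setdiff1d and np.unique; the variable names are illustrative.

```python
import numpy as np

testarray = np.array([4, 2, 4, 1, 2])
n_rows = len(testarray)

# Row indices that never appear as a value, so those rows stay zero.
never_written = np.setdiff1d(np.arange(n_rows), testarray)

# Values that appear more than once, so those rows are overwritten.
values, counts = np.unique(testarray, return_counts=True)
written_twice = values[counts > 1]

print(never_written)   # [0 3]
print(written_twice)   # [2 4]
```

If never_written is non-empty, the value-indexed loop cannot fill every row, which matches the "half empty" symptom described in the question.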

Answer 1: 2022-07-01 18:20:13
Answer 2: 2022-07-01 22:01:05
