Fastest way to fill numpy array with new arrays from function

I have a function f(a) that takes one entry from a testarray and returns an array with 5 values:

f(testarray[0])
#Output: array([[0, 1, 5, 3, 2]])

Since f(testarray[0]) is the result of an experiment, I want to run f for each entry of testarray and store each result in a new NumPy array. I always thought this would be quite simple: take an empty NumPy array with the length of testarray and save the results the following way:

N = 1000 #Number of entries of the testarray
test_result  = np.zeros([N, 5], dtype=int)

for i in testarray:
        test_result[i] = f(i)

When I run this, I don't get any error message, but the results are nonsense (half of test_result is empty, while the rest is filled with implausible values). Since f() works perfectly for a single entry of testarray, I suppose that something about the way I save the results in test_result is wrong. What am I missing here?

(I know that I could save the results in a list by appending to an empty list, but this method is too slow for the large number of times I want to run the function.)

Since you don't seem to understand indexing, stick with this approach:

alist = [f(i) for i in testarray]
arr = np.array(alist)
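
One caveat about the shape of the result. The output shown in the question, array([[0, 1, 5, 3, 2]]), suggests that f returns a 2-D array of shape (1, 5). If that is the case, np.array(alist) ends up with shape (N, 1, 5) rather than (N, 5). The sketch below uses a hypothetical stand-in for f (the real function is not shown in the question) and joins the pieces with np.concatenate instead:

```python
import numpy as np

def f(a):
    # Hypothetical stand-in for the question's f: returns shape (1, 5).
    return np.array([[a, a + 1, a + 2, a + 3, a + 4]])

testarray = np.array([10, 20, 30])

alist = [f(a) for a in testarray]

stacked = np.array(alist)             # shape (3, 1, 5): one extra axis
arr = np.concatenate(alist, axis=0)   # shape (3, 5): one row per entry
```

If f returns a flat array of 5 values instead, np.array(alist) already has shape (N, 5) and no extra step is needed.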

I could show how to use row indices and testarray values together, but that requires more explanation.
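
For reference, one common way to pair row indices with testarray values is enumerate, which yields the position and the value on each iteration. This is only a sketch: f here is a hypothetical stand-in for the experiment function, and it assumes f returns 5 values that fit into one row.

```python
import numpy as np

def f(a):
    # Hypothetical stand-in for the experiment function.
    return np.array([a * k for k in range(1, 6)])

testarray = np.array([5, 6, 7, 3, 1])
test_result = np.zeros((len(testarray), 5), dtype=int)

# row is the position in testarray (0, 1, 2, ...); value is the entry itself.
for row, value in enumerate(testarray):
    test_result[row] = f(value)
```

The row index decides where the result is stored, and the value is only ever passed to f, so the contents of testarray can no longer select the wrong row.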

Your problem can be reproduced with the following small example:

testarray = np.array([5, 6, 7, 3, 1])

def f(x):
    return np.array([x * i for i in np.arange(1, 6)])

f(testarray[0])
# [ 5 10 15 20 25]

test_result = np.zeros([len(testarray), 5], dtype=int)  # len(testarray) or testarray.shape[0]

So, as hpaulj mentioned in the comments, you must be careful how you use indexing:

for i in range(len(testarray)):
    test_result[i] = f(testarray[i])

# [[ 5 10 15 20 25]
#  [ 6 12 18 24 30]
#  [ 7 14 21 28 35]
#  [ 3  6  9 12 15]
#  [ 1  2  3  4  5]]
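
As a side note on speed: for this particular illustrative f, which only multiplies x by 1 to 5, the whole loop can be replaced by NumPy broadcasting. This is a sketch for the toy function only; it does not carry over to a general f such as the experiment in the question, which has to be called once per entry.

```python
import numpy as np

testarray = np.array([5, 6, 7, 3, 1])
multipliers = np.arange(1, 6)

# testarray[:, None] has shape (5, 1); multiplying by the (5,) row of
# multipliers broadcasts to a (5, 5) table, one row per testarray entry.
test_result = testarray[:, None] * multipliers
```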

Another case is when testarray is itself an index array containing the shuffled integers from 0 to N-1, so that every row of the zero array test_result gets filled. For this case we can create a reproducible example as:

testarray = np.array([4, 3, 0, 1, 2])

def f(x):
    return np.array([x * i for i in np.arange(1, 6)])

f(testarray[0])
# [ 4  8 12 16 20]

test_result = np.zeros([len(testarray), 5], dtype=int)

So, using your loop gives the following result:

for i in testarray:
    test_result[i] = f(i)
# [[ 0  0  0  0  0]
#  [ 1  2  3  4  5]
#  [ 2  4  6  8 10]
#  [ 3  6  9 12 15]
#  [ 4  8 12 16 20]]

As can be seen from this loop, if the index array does not contain every integer from 0 to N-1, some rows of the zero array will be left as zeros (unchanged):

testarray = np.array([4, 2, 4, 1, 2])
test_result = np.zeros([len(testarray), 5], dtype=int)  # start again from a fresh zero array

for i in testarray:
    test_result[i] = f(i)
# [[ 0  0  0  0  0]   # <--
#  [ 1  2  3  4  5]
#  [ 2  4  6  8 10]
#  [ 0  0  0  0  0]   # <--
#  [ 4  8 12 16 20]]
