Python Numpy - Create 2d array where length is based on 1D array
Sorry for the confusing title, but I'm not sure how to make it more concise. Here are my requirements:
arr1 = np.array([3,5,9,1])
arr2 = ?(arr1)
arr2 would then be:
[
[0,1,2,0,0,0,0,0,0],
[0,1,2,3,4,0,0,0,0],
[0,1,2,3,4,5,6,7,8],
[0,0,0,0,0,0,0,0,0]
]
The width doesn't need to vary based on the max of arr1; the shape is known in advance. So to start, I've been able to create an array of zeros with that shape:
arr2 = np.zeros((len(arr1),max_len))
And then of course I could do a for loop over arr1 like this:
for i, element in enumerate(arr1):
    arr2[i,0:element] = np.arange(element)
but that would likely take a long time, since both dimensions here are rather large (arr1 has a few million rows, max_len is around 500). Is there a clean, optimized way to do this in numpy?
Building on a 'padding' idea posted by @Divakar some years ago:
In [161]: res = np.arange(9)[None,:].repeat(4,0)
In [162]: res[res>=arr1[:,None]] = 0
In [163]: res
Out[163]:
array([[0, 1, 2, 0, 0, 0, 0, 0, 0],
[0, 1, 2, 3, 4, 0, 0, 0, 0],
[0, 1, 2, 3, 4, 5, 6, 7, 8],
[0, 0, 0, 0, 0, 0, 0, 0, 0]])
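The same masking idea can also be written without the intermediate repeat, by letting broadcasting compare a row of column indices against a column of lengths. This is only a minimal sketch, assuming max_len is known in advance as in the question; the helper name padded_aranges is hypothetical and not part of numpy:

```python
import numpy as np

def padded_aranges(lengths, max_len):
    """Row i holds 0..lengths[i]-1, zero-padded to max_len (hypothetical helper)."""
    lengths = np.asarray(lengths)
    cols = np.arange(max_len)           # column indices, shape (max_len,)
    mask = cols < lengths[:, None]      # broadcasts to shape (n, max_len)
    return np.where(mask, cols, 0)      # keep the index where valid, else 0

arr1 = np.array([3, 5, 9, 1])
arr2 = padded_aranges(arr1, 9)
```

The boolean mask is still a full (n, max_len) temporary, so this mainly tidies the code; it is not guaranteed to be faster.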
I am adding a slight variation on @hpaulj's answer because you mentioned that max_len is around 500 and you have millions of rows. In this case, you can precompute a 500 by 500 matrix containing all possible rows and index into it using arr1:
import numpy as np
np.random.seed(0)
max_len = 500
arr = np.random.randint(0, max_len, size=10**5)
# Approach 1: generate all unique rows first, then index.
# Can be faster if max_len << len(arr).
# Measured: 53 ms
template = np.tril(np.arange(max_len)[None,:].repeat(max_len,0), k=-1)
res = template[arr,:]
# Approach 2: repeat the full arange for every row, then mask.
# Measured: 173 ms
res1 = np.arange(max_len)[None,:].repeat(arr.size,0)
res1[res1>=arr[:,None]] = 0
assert (res == res1).all()
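One detail worth noting: a template with max_len rows only covers lengths 0 through max_len - 1, which matches the randint call above but not a row that fills the whole width. Also, at 8 bytes per element a million rows of width 500 take about 4 GB, so a narrower dtype may help. The following is a sketch under two assumptions that are not in the original answer: lengths may equal max_len, and a 16-bit integer is acceptable because all values stay below 500. The name build_template is hypothetical:

```python
import numpy as np

def build_template(max_len, dtype=np.int16):
    """Lookup table with max_len + 1 rows; row k is arange(k) zero-padded."""
    cols = np.arange(max_len, dtype=dtype)    # shape (max_len,)
    rows = np.arange(max_len + 1)[:, None]    # shape (max_len + 1, 1)
    # Keep column index j in row k only where j < k.
    return np.where(cols < rows, cols, 0).astype(dtype)

max_len = 500
template = build_template(max_len)
lengths = np.array([0, 3, 500])   # includes the full-width case
res = template[lengths]           # fancy indexing copies one row per length
```

The table itself is tiny (501 x 500 elements), so the cost is dominated by the final indexing step that materializes the result.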
Try this:
import numpy as np
import itertools
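# one range per row; zip_longest pads the shorter ones with 0
l = map(range, arr1)
# column_stack expects a sequence, so materialize the iterator into a list
arr2 = np.column_stack(list(itertools.zip_longest(*l, fillvalue=0)))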
print(arr2)
array([[0, 1, 2, 0, 0, 0, 0, 0, 0],
[0, 1, 2, 3, 4, 0, 0, 0, 0],
[0, 1, 2, 3, 4, 5, 6, 7, 8],
[0, 0, 0, 0, 0, 0, 0, 0, 0]])