Accessing array elements by internal data index and order
This algorithmic problem is a bit too complex for me:
I have a NumPy array of data which is internally indexed in a peculiar way. The data is the output of two arrays spliced together (which I don't have), with a distinct ordering. I set the parameter max to be a positive integer, and the data output has the index format
[ 00 10 20 ... max0 11 21 ... max1 22 32 ... max2 33 ... maxmax ]
The parameter max determines the length of the output array and its ordering.
For example, for max=2, the data is in the order
[00 10 20 11 21 22]
Setting max=3 gives
[00 10 20 30 11 21 31 22 32 33]
And max=4 gives
[00 10 20 30 40 11 21 31 41 22 32 42 33 43 44]
And so on.
I would like to write an algorithm that makes a list/array of only the 3x values, i.e. the values with first index 3. That is, I would like to access only certain data values, selected by their first index.
However, where each datum sits in the array depends on the parameter max, as the examples above show. My only idea is to build some sort of sorting tree, but I am not sure how to do that with this max parameter.
This list comprehension (or iteration) produces the indices that you show:
[[j*10+i for j in range(i,max+1)] for i in range(max+1)]
max=2: [[0, 10, 20], [11, 21], [22]]
max=3: [[0, 10, 20, 30], [11, 21, 31], [22, 32], [33]]
max=4: [[0, 10, 20, 30, 40], [11, 21, 31, 41], [22, 32, 42], [33, 43], [44]]
These lists of lists can be flattened, but this arrangement may make it easier to think about the problem.
Or is it more useful to generate tuples:
In [134]: [[(j,i) for j in range(i,max+1)] for i in range(max+1)]
Out[134]:
[[(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)],
[(1, 1), (2, 1), (3, 1), (4, 1)],
[(2, 2), (3, 2), (4, 2)],
[(3, 3), (4, 3)],
[(4, 4)]]
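Using the (first, second) tuples above, one way to get at the question's "3x" values is to enumerate the tuples in data order and keep the positions whose first index matches. This is only a sketch: the names positions_with_first_index and max_index are made up for illustration, and the data array is a stand-in in which each value encodes its own label.

```python
import numpy as np

def positions_with_first_index(first, max_index):
    """Flat positions of the entries whose first index equals `first`."""
    # Same ordering as the comprehension above: for each second index i,
    # the first index j runs from i up to max_index.
    pairs = [(j, i) for i in range(max_index + 1) for j in range(i, max_index + 1)]
    return [pos for pos, (j, _) in enumerate(pairs) if j == first]

max_index = 4
# Stand-in data (an assumption): each value is 10*first + second.
data = np.array([j * 10 + i for i in range(max_index + 1)
                 for j in range(i, max_index + 1)])

picked = data[positions_with_first_index(3, max_index)]
print(picked)  # [30 31 32 33]
```

With real data, only the positions matter; the stand-in values are there to make the selection easy to check by eye.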
It isn't clear what you want to do with these numbers or indices, but here's an example of putting them in a 2d array:
In [150]: dlist=[[j*10+i for j in range(i,max+1)] for i in range(max+1)]
In [151]: ilist=[[(j,i) for j in range(i,max+1)] for i in range(max+1)]
In [152]: import itertools
In [155]: M=np.zeros((max+1,max+1),int)
In [157]: for (i,j),d in zip(itertools.chain(*ilist),itertools.chain(*dlist)):
     ...:     M[i,j]=d
In [158]: M
Out[158]:
array([[ 0, 0, 0, 0, 0],
[10, 11, 0, 0, 0],
[20, 21, 22, 0, 0],
[30, 31, 32, 33, 0],
[40, 41, 42, 43, 44]])
The first 5 numbers go in the first column, the next 4 in the second column, etc.
itertools.chain is one way of flattening a list of lists.
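As a small illustration of that flattening step, here is the max=2 nested list run through itertools.chain.from_iterable, which does the same job as chain(*nested) without unpacking the outer list. The variable names are illustrative only.

```python
import itertools

# Nested index labels for max=2, as produced by the list comprehension.
nested = [[0, 10, 20], [11, 21], [22]]

# chain.from_iterable walks each inner list in turn.
flat = list(itertools.chain.from_iterable(nested))
print(flat)  # [0, 10, 20, 11, 21, 22]
```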
The layout of M looks like a lower triangle. There's a numpy function to generate those indices:
In [176]: np.tril_indices(5)
Out[176]:
(array([0, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 4], dtype=int32),
array([0, 0, 1, 0, 1, 2, 0, 1, 2, 3, 0, 1, 2, 3, 4], dtype=int32))
So I could have filled M with:
data = np.dot([10,1],np.tril_indices(5))
M[np.tril_indices(5)] = data
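One caveat when filling M from the actual data rather than from computed labels: np.tril_indices walks the lower triangle row by row (00, 10, 11, 20, ...), whereas the data in the question runs column by column (00, 10, 20, ...). A possible workaround, sketched here with a stand-in data array and illustrative names, is to take np.triu_indices and swap its two index arrays, which visits the lower triangle in column order. Once M is filled, the values with first index 3 are simply the leading part of row 3.

```python
import numpy as np

max_index = 4
n = max_index + 1

# Stand-in for the real data (an assumption), in the order from the question.
data = np.array([j * 10 + i for i in range(n) for j in range(i, n)])

# triu_indices walks the upper triangle row by row; swapping the two
# arrays walks the lower triangle column by column, matching the data.
cols, rows = np.triu_indices(n)
M = np.zeros((n, n), dtype=data.dtype)
M[rows, cols] = data

first = 3
values = M[first, :first + 1]
print(values)  # [30 31 32 33]
```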
Digging into the code of tril_indices, I find that the starting point is a mask of 1s generated along these lines (this version, shown for max=4, marks only the entries strictly below the diagonal):
I=((np.arange(max)-np.arange(max)[:,None])<0).astype(int)
array([[0, 0, 0, 0],
[1, 0, 0, 0],
[1, 1, 0, 0],
[1, 1, 1, 0]])
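If building a 2d array is more than is needed, the position of any entry can also be worked out arithmetically from the layout in the question: column b starts after the b earlier columns, which hold max+1, max, ..., max+2-b entries, and the entry with first index a sits a-b places into that column. The function below is a sketch of that arithmetic; flat_position and its parameter names are hypothetical, not part of NumPy.

```python
def flat_position(first, second, max_index):
    """Position of entry (first, second) in the flattened data.

    Assumes the layout from the question: 0 <= second <= first <= max_index.
    """
    if not 0 <= second <= first <= max_index:
        raise ValueError("need 0 <= second <= first <= max_index")
    # Total length of the columns before column `second`.
    column_start = second * (max_index + 1) - second * (second - 1) // 2
    # Within a column, the first index starts at `second`.
    return column_start + (first - second)

# Positions of the 3x values for max=4.
positions = [flat_position(3, s, 4) for s in range(3 + 1)]
print(positions)  # [3, 7, 10, 12]
```

Indexing the data array with this list (for example data[positions] on a NumPy array) then returns the entries with first index 3.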