
Converting an array of pairs into a 2D array based on first column

Is there a (preferably elegant) way in Python to turn an array of pairs such as

[[3,350],[4,800],[0,150],[0,200],[4,750]]

into something like

[
  [150,200],
  [],
  [],
  [350],
  [800,750]
]

?

In other words, what's a good method for putting the second number of every pair into an array, with its row index determined by the first number of the pair?

Try taking a look at list comprehensions; they provide a one-line way of creating lists. If you don't know what they are, an introductory guide is a good place to start. Also take a look at tuples, which are more appropriate than lists for paired values. Note that tuples are immutable, so you cannot change one once you have created it.

Your list using tuples would look like this:

foo = [(3,350),(4,800),(0,150),(0,200),(4,750)]
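To illustrate the point about tuples, here is a minimal sketch (the names pairs, rows, values and mutated are only for illustration). It shows that a comprehension can unpack each tuple directly, and that trying to assign to a tuple element raises TypeError:

```python
# Pairs stored as tuples; same data as in the question.
pairs = [(3, 350), (4, 800), (0, 150), (0, 200), (4, 750)]

# Unpack each (row, value) tuple inside the comprehension.
rows = [row for row, value in pairs]
values = [value for row, value in pairs]

# Tuples are immutable: item assignment raises TypeError.
try:
    pairs[0][1] = 999
    mutated = True
except TypeError:
    mutated = False

print(rows, values, mutated)
```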

As far as I'm aware, Python lists have no predefined size; they grow and shrink as changes are made. So what you'll want to do is find the largest index value in the list. For example, foo = [x[0] for x in list_of_pairs] collects the first element of every list inside your main list, the one named list_of_pairs. Note that this strategy works for the tuple-based list as well.

The code below should do what you want:

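list_of_pairs = [[3,350],[4,800],[0,150],[0,200],[4,750]]
# largest row index that appears as the first element of a pair
max_index = max(x[0] for x in list_of_pairs)
new_list = []

# visit every row index, including those with no pairs, so gaps become []
for i in range(max_index + 1):
    new_list.append([x[1] for x in list_of_pairs if x[0] == i])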
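The same idea can be condensed into a single nested comprehension. This is only a sketch, assuming the five pairs from the question and that row indexes are non-negative integers. Iterating over range(max_index + 1) is what makes rows without any pairs come out as empty lists:

```python
list_of_pairs = [[3, 350], [4, 800], [0, 150], [0, 200], [4, 750]]

# Highest row index that appears as the first element of any pair.
max_index = max(row for row, _ in list_of_pairs)

# One inner list per row index from 0 to max_index;
# rows that have no pairs stay empty.
new_list = [
    [value for row, value in list_of_pairs if row == i]
    for i in range(max_index + 1)
]

print(new_list)  # [[150, 200], [], [], [350], [800, 750]]
```

Note that this rescans the whole input once per row, so the work grows with (number of rows) x (number of pairs). For small inputs that is usually fine.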

As @thefourtheye noted, a dict might be a better container. If you want a 2D list, you could first add the values to an intermediate dict where the key is the row and the value is a list of numbers. Then you could use a list comprehension to generate the final result:

>>> l = [[3,350],[4,800],[0,150],[0,200],[4,750]]
>>> d = {}
>>> for row, num in l:
...     d.setdefault(row, []).append(num)
...
>>> [d.get(i, []) for i in range(max(d.keys()) + 1)]
[[150, 200], [], [], [350], [800, 750]] 
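A possible variation on the session above uses collections.defaultdict in place of setdefault and wraps the steps in a function. The function name group_by_row is hypothetical, and the empty-input check is an addition that is not part of the original answer:

```python
from collections import defaultdict

def group_by_row(pairs):
    """Group the second element of each pair under its first element (the row index)."""
    grouped = defaultdict(list)
    for row, num in pairs:
        grouped[row].append(num)
    if not grouped:
        # max() would raise ValueError on an empty mapping.
        return []
    # .get() does not trigger the default factory, so missing rows
    # are not inserted into the dict while building the result.
    return [grouped.get(i, []) for i in range(max(grouped) + 1)]

result = group_by_row([[3, 350], [4, 800], [0, 150], [0, 200], [4, 750]])
print(result)  # [[150, 200], [], [], [350], [800, 750]]
```

Each pair is visited once here, so this scales better than re-filtering the whole list for every row.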

I would use the pandas module for this task (with numpy imported as np and pandas as pd):

In [186]: a = np.array([[3,350],[4,800],[0,150],[0,200],[4,750]])

In [187]: res = pd.DataFrame(a).groupby(0)[1].apply(list).to_frame('val').rename_axis('idx')

In [188]: res
Out[188]:
            val
idx
0    [150, 200]
3         [350]
4    [800, 750]

Now you have an indexed data set and you can use it in the following way:

In [190]: res.loc[0, 'val']
Out[190]: [150, 200]

In [191]: res.loc[0, 'val'][1]
Out[191]: 200

In [192]: res.loc[4, 'val']
Out[192]: [800, 750]

P.S. I think you don't have to keep empty lists in the resulting data set, as it's a waste of resources.
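If the empty rows really are unnecessary, a similar sparse, index-addressable result can be built with the standard library alone. The sketch below is an alternative to the pandas approach, not a reproduction of it; it uses itertools.groupby, and the names ordered and sparse are only illustrative:

```python
from itertools import groupby
from operator import itemgetter

pairs = [[3, 350], [4, 800], [0, 150], [0, 200], [4, 750]]

# groupby only merges adjacent items, so sort by row index first.
# sorted() is stable, so values keep their original order within a row.
ordered = sorted(pairs, key=itemgetter(0))

# Map each row index that actually occurs to its list of values.
sparse = {
    row: [value for _, value in group]
    for row, group in groupby(ordered, key=itemgetter(0))
}

print(sparse)             # {0: [150, 200], 3: [350], 4: [800, 750]}
print(sparse.get(1, []))  # [] for a row that has no values
```

Only rows that occur in the input are stored; a lookup with .get(i, []) still behaves as if the missing rows were empty.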

There are numerous ways to do this. Here's a fairly straightforward one:

a = [[3, 350], [4, 800], [0, 150], [0, 200], [4, 750]]

rows, values = zip(*a)
b = [[] for _ in range(max(rows)+1)]  # initialize 2D output
for i, row in enumerate(rows):
    b[row].append(values[i])

print(b)  # -> [[150, 200], [], [], [350], [800, 750]]
