
Create list for each unique value

I'm currently looking at a table with the following structure:

uid | action
 1  |   A1
 1  |   A1
 1  |   A1
 1  |   A4
 2  |   A1
 2  |   A8
 2  |   A9
 3  |   A3
 3  |   A7

I'm trying to create a multidimensional array with the following structure:

[[A1, A1, A1, A4], [A1, A8, A9], [A3, A7]] 

My idea is to keep track of a uid and append the actions to a list until the uid changes. Once the uid does change, the collected actions are appended to the outer array and the tracked uid is updated to the new uid.
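For reference, the tracking idea described above could be sketched roughly as follows. This is only an illustration: it assumes each row is a dict with 'uid' and 'action' keys and that rows arrive already ordered by uid, and the function name group_actions is made up for this example.

```python
def group_actions(rows):
    """Group consecutive rows that share a uid into lists of actions."""
    result = []
    current_uid = object()  # sentinel that never equals a real uid
    for row in rows:
        if row['uid'] != current_uid:
            # The uid changed: track the new uid and start a new inner list.
            current_uid = row['uid']
            result.append([])
        result[-1].append(row['action'])
    return result


# Sample rows mirroring the table in the question (assumed to be dicts).
rows = [
    {'uid': 1, 'action': 'A1'},
    {'uid': 1, 'action': 'A1'},
    {'uid': 1, 'action': 'A1'},
    {'uid': 1, 'action': 'A4'},
    {'uid': 2, 'action': 'A1'},
    {'uid': 2, 'action': 'A8'},
    {'uid': 2, 'action': 'A9'},
    {'uid': 3, 'action': 'A3'},
    {'uid': 3, 'action': 'A7'},
]

print(group_actions(rows))
```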

I've come up with a somewhat overblown and incorrect solution using itertools.groupby(), but I'm not satisfied with it and am looking for something simpler. However, I've overthought this problem and keep coming up with more complicated solutions.

Any tips would be appreciated.

Code:

import itertools

data = []
for i, j in itertools.groupby(table, key=lambda x: x['uid']):
    event_array = []
    for k in j:
        event_array.append(k['action'])
    # Note: this stores [uid, actions] pairs, not just the list of actions.
    data.append([i, event_array])

As per OP's comment,

@Black Are you sure that the data is ordered?

... @thefourtheye, yes pretty sure as I've had to write it in sql before reading it into python

Since the data is already ordered, for example, like this

>>> data = [{'action': 'A1', 'uid': 1},
...  {'action': 'A1', 'uid': 1},
...  {'action': 'A1', 'uid': 1},
...  {'action': 'A4', 'uid': 1},
...  {'action': 'A1', 'uid': 2},
...  {'action': 'A8', 'uid': 2},
...  {'action': 'A9', 'uid': 2},
...  {'action': 'A3', 'uid': 3},
...  {'action': 'A7', 'uid': 3}]

you can simply use groupby itself, with a nested list comprehension, like this

>>> from itertools import groupby
>>> [[k['action'] for k in j] for i, j in groupby(data, key=lambda x: x['uid'])]
[['A1', 'A1', 'A1', 'A4'], ['A1', 'A8', 'A9'], ['A3', 'A7']]
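Note that groupby only merges consecutive items with the same key, which is why the ordering matters. If the rows were not ordered by uid, one option would be to sort them first. The following is a small sketch with made-up sample data (the names unordered, naive, by_uid and grouped are illustrative only):

```python
from itertools import groupby
from operator import itemgetter

# Made-up rows that are NOT ordered by uid.
unordered = [
    {'action': 'A1', 'uid': 2},
    {'action': 'A1', 'uid': 1},
    {'action': 'A8', 'uid': 2},
    {'action': 'A4', 'uid': 1},
]

# Without sorting, each run of equal uids becomes its own group,
# so uid 1 and uid 2 are both split into two groups here.
naive = [[row['action'] for row in group]
         for _, group in groupby(unordered, key=itemgetter('uid'))]

# sorted() is stable, so actions keep their relative order within each uid.
by_uid = sorted(unordered, key=itemgetter('uid'))
grouped = [[row['action'] for row in group]
           for _, group in groupby(by_uid, key=itemgetter('uid'))]

print(naive)
print(grouped)
```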

You can use good old defaultdict:

from collections import defaultdict

DATA = [{'uid': uid, 'action': action}
        for uid, action in [(1, 'A1'),
                            (1, 'A1'),
                            (1, 'A1'),
                            (1, 'A4'),
                            (2, 'A1'),
                            (2, 'A8'),
                            (2, 'A9'),
                            (3, 'A3'),
                            (3, 'A7'),]]

d = defaultdict(list)

for data in DATA:
    d[data['uid']].append(data['action'])

print(list(d.values()))

Result will be:

[['A1', 'A1', 'A1', 'A4'], ['A1', 'A8', 'A9'], ['A3', 'A7']]

This should work, but it seems like groupby is already perfectly good.

uids = {}
for row in table:
    uids.setdefault(row['uid'], []).append(row['action'])

data = [uids[uid] for uid in sorted(uids.keys())]

The solution simply iterates over each row in the table, and makes sure that there is a list for the corresponding uid in the uids dict (using setdefault). Then it appends the action for that row onto the list.

So uids will be a dictionary whose keys are the UIDs and whose values are lists of the corresponding actions from the table.

If you really want a list of lists (a "multidimensional array"), the last line uses a list comprehension to build a list whose elements are the lists of actions stored in the uids dict, ordered by uid.
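As a self-contained illustration of the steps described above, here is the same approach run on a small made-up table whose rows are deliberately not ordered by uid (the sample rows are an assumption for the example, not the OP's data):

```python
# Made-up rows, intentionally out of uid order.
table = [
    {'uid': 3, 'action': 'A3'},
    {'uid': 1, 'action': 'A1'},
    {'uid': 2, 'action': 'A1'},
    {'uid': 1, 'action': 'A4'},
    {'uid': 3, 'action': 'A7'},
]

uids = {}
for row in table:
    # Create the list for this uid on first sight, then append the action.
    uids.setdefault(row['uid'], []).append(row['action'])

# Order the inner lists by uid; actions keep their table order within each uid.
data = [uids[uid] for uid in sorted(uids)]

print(data)
```

Because the final step sorts the keys, this variant does not depend on the rows being ordered by uid beforehand.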
