I'm trying to sort a pandas aggregation

Question

I'm trying to aggregate and sort data from my dataset, but I don't know how to do it. Can someone help me?

import pandas as pd

data = {'message_id':  ['1', '1', '1', '1', '2', '2', '2'],
        'to': ['one', 'two', 'three', 'four', 'five', 'six', 'five'],
        'idt': ['1','2','3','4','5','6','5']
        }

df = pd.DataFrame(data, columns = ['message_id','to','idt'])

agg_func_text = {'to': [ set], 'idt': [ set]}

df.sort_values(by=['message_id', 'to'])

df3=df.groupby(['message_id']).agg(agg_func_text)

The result:

message_id  to set              idt set
1       {four, three, one, two}     {2, 3, 1, 4}
2       {five, six}         {5, 6}

But I would like to receive this result:

message_id  to set              idt set
1       {one, two, three, four}     {1, 2, 3, 4}
2       {five, six}         {5, 6}

Answer 1

In Python, a set has no defined order, so you cannot sort it or change the order of its elements. A possible solution is the dict.fromkeys() trick, which removes duplicates while keeping the order in which the values first appear. The output is displayed as a tuple, which has a defined order (and is sorted if the DataFrame was sorted first):

# dict keys are unique and keep insertion order; tuple() makes the result a real tuple
f = lambda x: tuple(dict.fromkeys(x))
agg_func_text = {'to': f, 'idt': f}

# sort_values returns a new DataFrame, so assign the result back if sorting is needed
df = df.sort_values(by=['message_id', 'idt'])

df3 = df.groupby('message_id').agg(agg_func_text)

print(df3)
                                 to           idt
message_id                                       
1           (one, two, three, four)  (1, 2, 3, 4)
2                       (five, six)        (5, 6)
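
To see the dict.fromkeys() trick in isolation, here is a small sketch without pandas. The helper name unique_in_order is hypothetical and only used for illustration; it relies on dicts keeping insertion order, which is guaranteed from Python 3.7 on.

```python
def unique_in_order(values):
    """Drop duplicates while keeping the order of first appearance."""
    # dict keys are unique and remember the order they were inserted in
    return tuple(dict.fromkeys(values))


# the 'to' values of message_id '2' from the question
print(unique_in_order(['five', 'six', 'five']))
```

Unlike set, this never reorders the values, so whatever order the rows have when they reach groupby is the order that ends up in the result.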

Answer 2

Sort using the numbers in a dictionary, save the results, then translate each number back to its word representation.

import pandas as pd

data = {'message_id': ['1', '1', '1', '1', '2', '2', '2'],
        'to': ['one', 'two', 'three', 'four', 'five', 'six', 'five'],
        'idt': ['1', '2', '3', '4', '5', '6', '5']}

df = pd.DataFrame(data, columns=['message_id', 'to', 'idt'])

# collect the unique values of each group (sets are unordered)
agg_func_text = {'to': set, 'idt': set}
grouped = df.groupby(['message_id']).agg(agg_func_text)

# sorted() turns each set of ids into an ordered list
grouped['idt'] = grouped['idt'].apply(lambda x: sorted(x))

# map each word to its number, sort the numbers, then map them back to words
dct = {'one': 1, 'two': 2, 'three': 3, 'four': 4, 'five': 5, 'six': 6, 'seven': 7, 'eight': 8, 'nine': 9}
dct2 = {1: 'one', 2: 'two', 3: 'three', 4: 'four', 5: 'five', 6: 'six', 7: 'seven', 8: 'eight', 9: 'nine'}
grouped['to'] = grouped['to'].apply(lambda x: sorted([dct[item] for item in x]))
grouped['to'] = grouped['to'].apply(lambda x: [dct2[item] for item in x])
print(grouped)

Output:
                                 to           idt
message_id                                       
1           [one, two, three, four]  [1, 2, 3, 4]
2                       [five, six]        [5, 6]
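
The round trip through two dictionaries can probably be shortened to a single sort with a key function. This is a minimal sketch of that idea, not the answerer's code; WORD_RANK and sort_words are hypothetical names, and the mapping only covers the words one to nine.

```python
# word -> number lookup, same content as dct in the answer above
WORD_RANK = {'one': 1, 'two': 2, 'three': 3, 'four': 4, 'five': 5,
             'six': 6, 'seven': 7, 'eight': 8, 'nine': 9}


def sort_words(words):
    """Return the words as a list ordered by their numeric value."""
    # the key function looks up each word's number; unknown words raise KeyError
    return sorted(words, key=lambda word: WORD_RANK[word])


print(sort_words({'four', 'three', 'one', 'two'}))
```

Under that assumption, grouped['to'].apply(sort_words) would replace the two apply calls that use dct and dct2.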

Answer 1
1 Accepted 2021-02-09 14:24:01

Answer 2
0 2021-02-09 19:15:44
