Grouping by unique values while transposing column
I asked a similar question the other day with data from two columns:
Grouping columns by unique values in Python
Now I have three columns. They need to be grouped by column A, with the values of column B as the column headers and the values of column C placed under the matching header.
My data frame looks like:
A B C
25115 20 45
25115 30 154
25115 40 87
25115 70 21
25115 90 74
26200 10 48
26200 20 414
26200 40 21
26200 50 288
26200 80 174
26200 90 54
But I need to end up with this:
        10   20   30   40   50   70   80   90
25115        45  154   87        21        74
26200   48  414        21  288       174   54
This gets the values of column C, but not with column B as the header values.
import pandas as pd
df = pd.DataFrame({'A':[25115,25115,25115,25115,25115,26200,26200,26200,26200,26200,26200],'B':[20,30,40,70,90,10,20,40,50,80,90],'C':[45,154,87,21,74,48,414,21,288,174,54]})
a = df.groupby('A')['C'].apply(lambda x:' '.join(x.astype(str)))
Any ideas would be most appreciated.
Use pivot_table:
df.pivot_table(values='C',index='A',columns='B')
Output:
B 10 20 30 40 50 70 80 90
A
25115 NaN 45.0 154.0 87.0 NaN 21.0 NaN 74.0
26200 48.0 414.0 NaN 21.0 288.0 NaN 174.0 54.0
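The NaN cells mark (A, B) pairs that have no row in the input, and their presence is what turns the integers into floats. If whole numbers are preferred, one possible follow-up is to cast the result to pandas' nullable integer dtype. This is a minimal sketch, not part of the original answer: it assumes each (A, B) pair occurs at most once, so aggfunc='first' simply picks the single value instead of the default mean, and the names wide and wide_int are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [25115] * 5 + [26200] * 6,
    'B': [20, 30, 40, 70, 90, 10, 20, 40, 50, 80, 90],
    'C': [45, 154, 87, 21, 74, 48, 414, 21, 288, 174, 54],
})

# Assumption: (A, B) pairs are unique, so 'first' returns the only value.
wide = df.pivot_table(values='C', index='A', columns='B', aggfunc='first')

# Missing pairs are NaN, which makes every column float64.
# The nullable 'Int64' dtype keeps whole numbers and shows gaps as <NA>.
wide_int = wide.astype('Int64')
print(wide_int)
```

Filling the gaps with 0 (for example via pivot_table's fill_value argument) would also restore integers, but it makes a missing pair look like a real zero, so the nullable dtype may be the safer choice.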
Use set_index / unstack:
df.set_index(['A','B'])['C'].unstack()
Output:
B 10 20 30 40 50 70 80 90
A
25115 NaN 45.0 154.0 87.0 NaN 21.0 NaN 74.0
26200 48.0 414.0 NaN 21.0 288.0 NaN 174.0 54.0
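Both approaches give the same result here because every (A, B) pair appears only once. They differ when a pair repeats: unstack cannot reshape an index with duplicate entries and raises a ValueError, while pivot_table aggregates the duplicates (by mean unless aggfunc says otherwise). A small sketch with made-up data to illustrate the difference; the frame dup and the variable names are hypothetical:

```python
import pandas as pd

# Made-up data: the pair (A=1, B=10) appears twice.
dup = pd.DataFrame({'A': [1, 1, 2], 'B': [10, 10, 20], 'C': [4, 6, 5]})

# set_index / unstack refuses to reshape duplicate index entries.
try:
    dup.set_index(['A', 'B'])['C'].unstack()
    unstack_failed = False
except ValueError as exc:
    unstack_failed = True
    print('unstack failed:', exc)

# pivot_table aggregates duplicates instead (default aggfunc is the mean).
averaged = dup.pivot_table(values='C', index='A', columns='B')
print(averaged)
```

So if duplicates would indicate a data problem, the unstack version fails loudly, whereas pivot_table silently combines them.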