Reshaping dataframe on one column in pandas
I have a dataframe in pandas as follows:
ID_1 ID_2 Date Val
1234 1480 6/13/1970 10
1234 1480 7/8/1970 9
1234 1480 6/4/1970 8
1234 1480 4/1/1970 7
5567 1481 11/20/1970 25
5567 1481 5/25/1970 12
5567 1481 4/23/1970 9
8799 1482 12/23/1970 8
8799 1482 4/23/1970 7
8799 1482 9/26/1970 6
I want to convert this into another dataframe that has the following format:
ID_1 ID_2 Largest Event 2nd Largest Event 3rd Largest Event 4th Largest Event
1234 1480 6/13/1970 7/8/1970 6/4/1970 4/1/1970
5567 1481 11/20/1970 5/25/1970 4/23/1970 NaN
8799 1482 12/23/1970 4/23/1970 9/26/1970 NaN
This is a subset of a much larger dataframe where I want the 10 largest events. The dates are already sorted in descending order of the Val column, so the sorting is not an issue.
Any ideas?
You can use rank with pivot_table:
df.Val=df.Val.rank(ascending=False).astype(int).astype(str)+' Largest Event'
df.pivot_table(index=['ID_1','ID_2'],columns='Val',values='Date',aggfunc='sum').reset_index()
Out[629]:
Val ID_1 ID_2 1 Largest Event 2 Largest Event 3 Largest Event
0 1234 1480 6/13/1970 7/8/1970 6/4/1970
Update: to rank within each (ID_1, ID_2) pair rather than across the whole frame, apply rank after a groupby:
df.Val=df.groupby(['ID_1','ID_2']).Val.rank(ascending=False).astype(int).astype(str)+' Largest Event'
df.pivot_table(index=['ID_1','ID_2'],columns='Val',values='Date',aggfunc='sum').reset_index()
Out[673]:
Val ID_1 ID_2 1 Largest Event 2 Largest Event 3 Largest Event \
0 1234 1480 6/13/1970 7/8/1970 6/4/1970
1 5567 1481 11/20/1970 5/25/1970 4/23/1970
2 8799 1482 12/23/1970 4/23/1970 9/26/1970
Val 4 Largest Event
0 4/1/1970
1 None
2 None
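
Because the question states that the rows are already sorted by Val in descending order within each ID pair, a positional counter can stand in for rank. The following is a minimal sketch under that assumption, not the answer's method: it uses groupby(...).cumcount() to number the rows of each group and unstack to spread them into columns. TOP_N and the variable names ranked and wide are illustrative and do not come from the original post.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID_1": [1234] * 4 + [5567] * 3 + [8799] * 3,
    "ID_2": [1480] * 4 + [1481] * 3 + [1482] * 3,
    "Date": ["6/13/1970", "7/8/1970", "6/4/1970", "4/1/1970",
             "11/20/1970", "5/25/1970", "4/23/1970",
             "12/23/1970", "4/23/1970", "9/26/1970"],
    "Val": [10, 9, 8, 7, 25, 12, 9, 8, 7, 6],
})

TOP_N = 10  # illustrative: number of largest events to keep per ID pair

# Assumption: rows are already in descending Val order within each group,
# so the row position inside the group (0, 1, 2, ...) is its rank minus one.
ranked = df.copy()
ranked["rank"] = ranked.groupby(["ID_1", "ID_2"]).cumcount() + 1

# Keep only the top N rows of each group.
ranked = ranked[ranked["rank"] <= TOP_N]

# Move the rank level into the columns; groups with fewer rows get NaN.
wide = ranked.set_index(["ID_1", "ID_2", "rank"])["Date"].unstack("rank")
wide.columns = [f"{r} Largest Event" for r in wide.columns]
wide = wide.reset_index()

print(wide)
```

Unlike rank, cumcount never produces fractional or tied positions, so no cast to int is needed. If the rows were not pre-sorted, they would have to be sorted by Val within each group first.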