Pandas Dataframes - new dataframe with reorganized data
I'm new to Pandas and trying to create a new dataframe from an existing one.
My current dataframe has this format:
ID   Country   Status
ABC  USA       Go
ABC  Columbia  Stop
ABC  Japan     Pause
ABC  Egypt     Go
DEF  Canada    Go
DEF  Peru      Stop
I'm trying to consolidate the data to make it more compact. The format I want is:
ID   Go          Stop      Pause
ABC  USA, Egypt  Columbia  Japan
DEF  Canada      Peru
Basically, the possible Status values become the columns and, for each ID, these columns are populated with a list of the countries having that status. I'm struggling to find the best way to approach this, so any suggestions would be greatly appreciated.
You can use pd.pivot_table (called here as the df.pivot_table method), with ', '.join as the aggregation function:
res = df.pivot_table(index='ID', columns='Status', values='Country', aggfunc=', '.join)
print(res)
Status          Go  Pause      Stop
ID
ABC     USA, Egypt  Japan  Columbia
DEF         Canada   None      Peru
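For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample frame from the question and applies the same pivot_table call. The fillna('') step and the column reordering are my own additions (assumptions, not part of the answer above), meant to match the Go / Stop / Pause layout the question asks for.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": ["ABC", "ABC", "ABC", "ABC", "DEF", "DEF"],
    "Country": ["USA", "Columbia", "Japan", "Egypt", "Canada", "Peru"],
    "Status": ["Go", "Stop", "Pause", "Go", "Go", "Stop"],
})

# Same call as in the answer: one row per ID, one column per Status,
# countries in each cell joined into a single string.
res = df.pivot_table(index="ID", columns="Status", values="Country",
                     aggfunc=", ".join)

# Assumption: empty cells should be blank strings rather than missing values,
# and the columns should follow the order used in the question.
res = res.fillna("")[["Go", "Stop", "Pause"]]

print(res)
```

Cells in this result hold joined strings, not Python lists, so splitting on ', ' would be needed to get the individual countries back.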
If you absolutely must do this, here is how: group by ID and Status, join the countries in each group, then unstack Status into columns.
In [48]: df.groupby(['ID', 'Status'])['Country'].apply(','.join).unstack()
Out[48]:
Status         Go  Pause      Stop
ID
ABC     USA,Egypt  Japan  Columbia
DEF        Canada    NaN      Peru
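The question mentions "a list of countries", so if actual Python lists are wanted in each cell rather than joined strings, the same groupby/unstack pattern can aggregate with list instead. This is only a sketch of that variant; the name as_lists is hypothetical, and the sample frame is rebuilt from the question so the snippet runs on its own.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": ["ABC", "ABC", "ABC", "ABC", "DEF", "DEF"],
    "Country": ["USA", "Columbia", "Japan", "Egypt", "Canada", "Peru"],
    "Status": ["Go", "Stop", "Pause", "Go", "Go", "Stop"],
})

# Collect the countries of each (ID, Status) pair into a list,
# then move the Status level of the index into the columns.
as_lists = df.groupby(["ID", "Status"])["Country"].agg(list).unstack()

print(as_lists)
```

Combinations that never occur, such as DEF with Pause, come out as NaN, the same as in the joined-string version.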