Manipulate Dataframe
Suppose I am working with the following dataset:
# dummy dataset
import pandas as pd
data = pd.DataFrame({
    "Name_id": ["John", "Deep", "Julia", "John", "Sandy", "Deep"],
    "Month_id": ["December", "March", "May", "April", "May", "July"],
    "Colour_id": ["Red", "Purple", "Green", "Black", "Yellow", "Orange"],
})
data
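How can I convert this DataFrame into something like this:
where A_id is unique, and new columns are formed from the values and the presence or absence of the other columns, in order of occurrence? I tried pivot, but I noticed that it is used more for numerical data than for categorical data.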
Maybe you should try pivot after all:
# Number each Name_id's rows in order of appearance (1, 2, ...)
data['Rowid'] = data.groupby('Name_id').cumcount() + 1
d = data.pivot(index='Name_id', columns='Rowid', values=['Month_id', 'Colour_id'])
d = d.reset_index()
# pivot returns all Month_id columns first, then all Colour_id columns
d.columns = ['Name_id', 'Month_id1', 'Month_id2', 'Colour_id1', 'Colour_id2']
This gives
  Name_id Month_id1 Month_id2 Colour_id1 Colour_id2
0    Deep     March      July     Purple     Orange
1    John  December     April        Red      Black
2   Julia       May       NaN      Green        NaN
3   Sandy       May       NaN     Yellow        NaN
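Note that pivot groups the result by value column: both Month_id columns come first, followed by both Colour_id columns. If the goal is to keep each occurrence together (Month_id1, Colour_id1, Month_id2, Colour_id2), the columns can be reordered and named programmatically instead of by hand. The following is only a sketch under that assumption; the names wide, value_cols and row_ids are made up for illustration.

```python
import pandas as pd

data = pd.DataFrame({
    "Name_id": ["John", "Deep", "Julia", "John", "Sandy", "Deep"],
    "Month_id": ["December", "March", "May", "April", "May", "July"],
    "Colour_id": ["Red", "Purple", "Green", "Black", "Yellow", "Orange"],
})

# Number each Name_id's rows in order of appearance: 1, 2, ...
data["Rowid"] = data.groupby("Name_id").cumcount() + 1

value_cols = ["Month_id", "Colour_id"]  # illustrative helper names
wide = data.pivot(index="Name_id", columns="Rowid", values=value_cols)

# The pivoted columns are (value column, Rowid) pairs grouped by value column.
# Reorder them so that the columns of each occurrence sit next to each other.
row_ids = sorted(data["Rowid"].unique())
wide = wide[[(col, i) for i in row_ids for col in value_cols]]

# Flatten the two-level labels into names such as "Month_id1".
wide.columns = [f"{col}{i}" for col, i in wide.columns]
wide = wide.reset_index()
print(wide)
```

Because the names are derived from the pivoted labels, this also works when some Name_id occurs more than twice, and a label can never end up on the wrong column. Names with fewer occurrences are padded with NaN, as before.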