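Python Pandas Pivot Of Two columns (ColumnName and Value)
I have a pandas dataframe with two columns plus the default index. The first column holds the intended "column name" and the second column holds the value wanted for that column.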
   name           returnattribute
0  Customer Name  Customer One Name
1  Customer Code  CGLOSPA
2  Customer Name  Customer Two Name
3  Customer Code  COTHABA
4  Customer Name  Customer Three Name
5  Customer Code  CGLOADS
6  Customer Name  Customer Four Name
7  Customer Code  CAPRCANBRA
8  Customer Name  Customer Five Name
9  Customer Code  COTHAMO
I would like to reshape this so that I have 5 rows with two columns ("Customer Name" and "Customer Code") instead of 10 rows. The desired result is:
   Customer Code  Customer Name
0  CGLOSPA        Customer One Name
1  COTHABA        Customer Two Name
2  CGLOADS        Customer Three Name
3  CAPRCANBRA     Customer Four Name
4  COTHAMO        Customer Five Name
I tried the pandas pivot function:
df.pivot(columns='name', values='returnattribute')
But this still produces ten rows, with alternating blanks:
   Customer Code  Customer Name
0  NaN            Customer One Name
1  CGLOSPA        NaN
2  NaN            Customer Two Name
3  COTHABA        NaN
4  NaN            Customer Three Name
5  CGLOADS        NaN
6  NaN            Customer Four Name
7  CAPRCANBRA     NaN
8  NaN            Customer Five Name
9  COTHAMO        NaN
How can I pivot the dataframe to get just 5 rows with two columns?
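In df.pivot, when the index parameter is not passed, df.index is used by default. Each of the ten original index labels therefore becomes its own row, which explains the output above. From the documentation: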
index : str or object or a list of str, optional
- Column to use to make new frame's index. If None, uses existing index.
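To see the default behaviour in isolation, here is a minimal sketch on a shortened four-row version of the data (the variable name `default_pivot` is only for illustration). Since every original index label is kept, the result has as many rows as the input, with one NaN per row.

```python
import pandas as pd

# Shortened copy of the question's data: two customers, four rows.
df = pd.DataFrame({
    "name": ["Customer Name", "Customer Code"] * 2,
    "returnattribute": ["Customer One Name", "CGLOSPA",
                        "Customer Two Name", "COTHABA"],
})

# No index argument: pivot falls back to the existing index (0..3),
# so each input row becomes its own output row.
default_pivot = df.pivot(columns="name", values="returnattribute")

print(default_pivot.shape)             # same number of rows as df
print(default_pivot.isna().sum().sum())  # one missing cell per row
```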
To get the desired output, you have to create a new index column in which both rows of a customer share the same value, as follows.
df.assign(idx=df.index//2).pivot(index='idx', columns='name', values='returnattribute')
# name Customer Code Customer Name
# idx
# 0 CGLOSPA Customer One Name
# 1 COTHABA Customer Two Name
# 2 CGLOADS Customer Three Name
# 3 CAPRCANBRA Customer Four Name
# 4 COTHAMO Customer Five Name
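For anyone who wants to run this end to end, the following is a self-contained sketch of the same idea. It rebuilds the question's dataframe by hand and additionally drops the `idx` and `name` axis labels so that the result looks like the desired output; that cleanup step and the variable name `wide` are additions for illustration. It assumes the default 0..n-1 index and strictly alternating name/code rows.

```python
import pandas as pd

names = ["One", "Two", "Three", "Four", "Five"]
codes = ["CGLOSPA", "COTHABA", "CGLOADS", "CAPRCANBRA", "COTHAMO"]

# Rebuild the long-format frame: a name row followed by a code row per customer.
rows = []
for n, c in zip(names, codes):
    rows.append(("Customer Name", f"Customer {n} Name"))
    rows.append(("Customer Code", c))
df = pd.DataFrame(rows, columns=["name", "returnattribute"])

# Rows 0-1 get idx 0, rows 2-3 get idx 1, and so on.
wide = (
    df.assign(idx=df.index // 2)
      .pivot(index="idx", columns="name", values="returnattribute")
)

# Optional cleanup: plain 0..4 index and no "name" label on the columns.
wide = wide.reset_index(drop=True)
wide.columns.name = None
print(wide)
```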
Because every two rows represent one data point, you can also use `reshape` on the values and then build the desired dataframe from the result.
reshaped = df['returnattribute'].to_numpy().reshape(-1, 2)
# array([['Customer One Name', 'CGLOSPA'],
# ['Customer Two Name', 'COTHABA'],
# ['Customer Three Name', 'CGLOADS'],
# ['Customer Four Name', 'CAPRCANBRA'],
# ['Customer Five Name', 'COTHAMO']], dtype=object)
col_names = pd.unique(df.name)
# array(['Customer Name', 'Customer Code'], dtype=object)
out = pd.DataFrame(reshaped, columns=col_names)
# Customer Name Customer Code
# 0 Customer One Name CGLOSPA
# 1 Customer Two Name COTHABA
# 2 Customer Three Name CGLOADS
# 3 Customer Four Name CAPRCANBRA
# 4 Customer Five Name COTHAMO
# we can reorder the columns using reindex.
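The column reordering mentioned in the last comment could look like the sketch below, shown on a shortened two-customer version of the data. The name `reordered` is hypothetical; note that the reshape approach relies on the rows alternating strictly between name and code.

```python
import pandas as pd

# Shortened copy of the question's data: two customers, four rows.
df = pd.DataFrame({
    "name": ["Customer Name", "Customer Code"] * 2,
    "returnattribute": ["Customer One Name", "CGLOSPA",
                        "Customer Two Name", "COTHABA"],
})

# Fold the value column into rows of two: [name, code] per customer.
reshaped = df["returnattribute"].to_numpy().reshape(-1, 2)
col_names = pd.unique(df["name"])  # order of first appearance
out = pd.DataFrame(reshaped, columns=col_names)

# Put "Customer Code" first, matching the desired output.
reordered = out.reindex(columns=["Customer Code", "Customer Name"])
print(reordered)
```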
You can also pass the new index directly to pivot_table, using aggfunc='first' because you have non-numeric data:
df.pivot_table(index=df.index//2, columns='name',
values='returnattribute', aggfunc='first')
output:
name Customer Code Customer Name
0 CGLOSPA Customer One Name
1 COTHABA Customer Two Name
2 CGLOADS Customer Three Name
3 CAPRCANBRA Customer Four Name
4 COTHAMO Customer Five Name
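All of the approaches above derive the grouping key from `df.index // 2`, which only works with the default 0..n-1 index. As a variation that is not part of the answers above, one could number each row within its `name` group using `groupby(...).cumcount()`. This is a sketch under the assumption that every customer has exactly one name row and one code row and that both appear in the same relative order.

```python
import pandas as pd

# Same kind of data, but with an index that does not start at 0,
# so that index // 2 would no longer pair the rows correctly.
df = pd.DataFrame(
    {
        "name": ["Customer Name", "Customer Code"] * 2,
        "returnattribute": ["Customer One Name", "CGLOSPA",
                            "Customer Two Name", "COTHABA"],
    },
    index=[3, 4, 5, 6],
)

# cumcount numbers the rows inside each "name" group: the first name row
# and the first code row both get 0, the second pair both get 1.
key = df.groupby("name").cumcount()

paired = df.assign(idx=key).pivot(
    index="idx", columns="name", values="returnattribute"
)
print(paired)
```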