Error when manipulating dataframe with columns of type string with Pandas Pivot Table
I have the dataframe:
And I would like to obtain this result, using Pivot Table or an alternative function:
I am trying to turn the rows of the Custom Field column into columns with the Pandas pivot_table function, and I get an error:
import pandas as pd
data = {
"Custom Field": ["CF1", "CF2", "CF3"],
"id": ["RSA", "RSB", "RSC"],
"Name": ["Wilson", "Junior", "Otavio"]
}
### create the dataframe ###
df = pd.DataFrame(data)
print(df)
df2 = df.pivot_table(columns=['Custom Field'], index=['Name'])
print(df2)
I suspect it is because I am working with strings.
Any suggestions?
Thanks in advance.
You need pivot, not pivot_table. The latter aggregates possibly repeating values (by default with the mean, which is not defined for strings), whereas the former only rearranges the values and fails if an index/column combination occurs more than once.
df.pivot(columns=['Custom Field'], index=['Name'])
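As a self-contained sketch of the call above, using the data from the question: passing values="id" is an addition here (not part of the original call) that keeps the column labels flat, and the dup frame is a hypothetical example built only to show how pivot reacts to a repeated Name / Custom Field pair.

```python
import pandas as pd

df = pd.DataFrame({
    "Custom Field": ["CF1", "CF2", "CF3"],
    "id": ["RSA", "RSB", "RSC"],
    "Name": ["Wilson", "Junior", "Otavio"],
})

# values="id" (an assumption, not in the original call) yields flat column
# labels CF1, CF2, CF3 instead of a two-level ("id", "CF1") header.
wide = df.pivot(index="Name", columns="Custom Field", values="id")
print(wide)

# Hypothetical frame with the first row repeated: pivot only rearranges
# values, so a duplicated (Name, Custom Field) pair raises ValueError.
dup = pd.concat([df, df.iloc[[0]]], ignore_index=True)
pivot_error = None
try:
    dup.pivot(index="Name", columns="Custom Field", values="id")
except ValueError as exc:
    pivot_error = exc
    print("pivot failed:", exc)
```

Cells with no matching row come back as NaN; they can be filled afterwards with fillna if a placeholder is preferred.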
Update as per comment: if there are multiple values per cell, you need to use pivot_table and specify an appropriate aggregate function, e.g. one that concatenates the string values. You can also specify a fill value for empty cells (instead of NaN):
df = pd.DataFrame({"Custom Field": ["CF1", "CF2", "CF3", "CF1"],
"id": ["RSA", "RSB", "RSC", "RSD"],
"Name": ["Wilson", "Junior", "Otavio", "Wilson"]})
df.pivot_table(columns=['Custom Field'], index=['Name'], aggfunc=','.join, fill_value='-')
                  id
Custom Field     CF1  CF2  CF3
Name
Junior             -  RSB    -
Otavio             -    -  RSC
Wilson       RSA,RSD    -    -
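If it is not known in advance whether the data contains repeated Name / Custom Field pairs, one possible approach is to check with duplicated first and then pick the function. This is only a sketch under that assumption: the names keys, has_duplicates and wide are illustrative, and values="id" is added to keep the column labels flat.

```python
import pandas as pd

df = pd.DataFrame({
    "Custom Field": ["CF1", "CF2", "CF3", "CF1"],
    "id": ["RSA", "RSB", "RSC", "RSD"],
    "Name": ["Wilson", "Junior", "Otavio", "Wilson"],
})

# A repeated (Name, Custom Field) pair means two values compete for one cell.
keys = ["Name", "Custom Field"]
has_duplicates = bool(df.duplicated(subset=keys).any())

if has_duplicates:
    # Aggregate the competing strings by joining them with a comma.
    wide = df.pivot_table(index="Name", columns="Custom Field", values="id",
                          aggfunc=",".join, fill_value="-")
else:
    # No duplicates: a plain rearrangement is enough.
    wide = df.pivot(index="Name", columns="Custom Field", values="id").fillna("-")

print(wide)
```

With the four-row frame above the pivot_table branch runs; with the three-row frame from the question the pivot branch would run instead.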