Pivoting a pandas dataframe with all string data in each column
Sample dataframe:
Name  Attribute  Response
Joe   A          Yes
Joe   B          smoking
Joe   B          headache
Mary  A          Null
Mary  B          Never
Bob   C          Today
Mary  A          Tomorrow
I have spent several hours searching through apparently similar SO questions, trying to pivot this df into the desired output below. Note that Joe and Mary each have more than one row in which the Attribute is the same but the Response is different.
Desired output:
Name  A               B                  C
Joe   Yes             smoking, headache  Null
Mary  Null, Tomorrow  Never              Null
Bob   Null            Null               Today
To reiterate, I have looked through every SO answer I could find on reshaping dataframes from long to wide, and none of them addressed this precise question. I tried the solutions suggested there, and all of them resulted in errors, either a ValueError or a DataError, most often one stating that the index contained duplicate values. Any help is appreciated.
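For reference, the duplicate-index error described above can be reproduced with a short sketch. It assumes the sample data is held in a dataframe named df and that "Null" is a literal string rather than a missing value; neither detail is stated in the question.

```python
import pandas as pd

# Sample data from the question; "Null" is kept as a plain string (an assumption).
df = pd.DataFrame(
    {
        "Name": ["Joe", "Joe", "Joe", "Mary", "Mary", "Bob", "Mary"],
        "Attribute": ["A", "B", "B", "A", "B", "C", "A"],
        "Response": ["Yes", "smoking", "headache", "Null", "Never", "Today", "Tomorrow"],
    }
)

# (Joe, B) and (Mary, A) each occur twice, so a plain pivot cannot fit
# two responses into a single cell and raises a ValueError.
try:
    df.pivot(index="Name", columns="Attribute", values="Response")
    pivot_error = None
except ValueError as exc:
    pivot_error = exc

print(type(pivot_error).__name__, pivot_error)

# List the rows that share a (Name, Attribute) pair with another row.
duplicated_pairs = df[df.duplicated(["Name", "Attribute"], keep=False)]
print(duplicated_pairs)
```

Any reshaping method that expects one value per (Name, Attribute) cell will hit the same problem, which is why an aggregation step is needed.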
You can use .pivot_table() with aggfunc=list:
print(
    df.pivot_table(
        index="Name", columns="Attribute", aggfunc=list, fill_value="Null"
    ).droplevel(0, axis=1)
)
Prints:
Attribute                 A                    B        C
Name
Bob                    Null                 Null  [Today]
Joe                   [Yes]  [smoking, headache]     Null
Mary       [Null, Tomorrow]              [Never]     Null
Or, if you don't want lists:
print(
    df.pivot_table(
        index="Name",
        columns="Attribute",
        aggfunc=",".join,
        fill_value="Null",
    ).droplevel(0, axis=1)
)
Prints:
Attribute              A                 B      C
Name
Bob                 Null              Null  Today
Joe                  Yes  smoking,headache   Null
Mary       Null,Tomorrow             Never   Null
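As a variation on the snippet above, here is a self-contained sketch that joins with ", " to match the separator in the desired output. It passes values="Response" explicitly, which should make the droplevel step unnecessary because only one value column is involved. The dataframe construction and the name wide are assumptions added for illustration.

```python
import pandas as pd

# Sample data from the question; "Null" is kept as a plain string (an assumption).
df = pd.DataFrame(
    {
        "Name": ["Joe", "Joe", "Joe", "Mary", "Mary", "Bob", "Mary"],
        "Attribute": ["A", "B", "B", "A", "B", "C", "A"],
        "Response": ["Yes", "smoking", "headache", "Null", "Never", "Today", "Tomorrow"],
    }
)

# Naming the single value column keeps the result's columns flat (A, B, C),
# and ", ".join concatenates every response sharing a (Name, Attribute) pair.
wide = df.pivot_table(
    index="Name",
    columns="Attribute",
    values="Response",
    aggfunc=", ".join,
    fill_value="Null",
)
print(wide)
```

Note that pivot_table sorts the row labels, so the rows come out as Bob, Joe, Mary rather than in the order they first appear.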
EDIT: To rename the indices (here, blanking out the "Name" and "Attribute" labels):
df = df.pivot_table(
    index="Name",
    columns="Attribute",
    aggfunc=",".join,
    fill_value="Null",
).droplevel(0, axis=1)  # drop the outer "Response" column level, as above
df.index.name = ""
df.columns.name = ""
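If the goal is the exact layout in the question, with Name as an ordinary column, one possible alternative is reset_index() followed by rename_axis(). This is a sketch under the same assumptions as before (sample data rebuilt by hand, "Null" as a literal string); the names wide and flat are illustrative.

```python
import pandas as pd

# Sample data from the question; "Null" is kept as a plain string (an assumption).
df = pd.DataFrame(
    {
        "Name": ["Joe", "Joe", "Joe", "Mary", "Mary", "Bob", "Mary"],
        "Attribute": ["A", "B", "B", "A", "B", "C", "A"],
        "Response": ["Yes", "smoking", "headache", "Null", "Never", "Today", "Tomorrow"],
    }
)

wide = df.pivot_table(
    index="Name",
    columns="Attribute",
    values="Response",
    aggfunc=",".join,
    fill_value="Null",
)

# Move "Name" out of the index into a regular column, then clear the
# leftover "Attribute" label on the column axis.
flat = wide.reset_index().rename_axis(None, axis=1)
print(flat)
```

Unlike assigning an empty string to the names, this leaves the column axis with no name at all, which tends to be cleaner when the frame is exported.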