
Pivoting a pandas dataframe with all string data in each column

Sample dataframe:

Name   Attribute     Response
Joe    A             Yes
Joe    B             smoking 
Joe    B             headache
Mary   A             Null
Mary   B             Never
Bob    C             Today
Mary   A             Tomorrow

I have spent several hours trying to pivot this df into the desired output below, and have searched through all the apparently similar SO questions. Note that Joe and Mary each have more than one row in which the Attribute is the same but the Response is different.

Desired output:

Name    A                 B                    C
Joe     Yes               smoking, headache    Null
Mary    Null, Tomorrow    Never                Null
Bob     Null              Null                 Today

To reiterate, I have looked through every SO answer on reshaping dataframes from long to wide, and none of them addresses this precise question. Every approach from those answers that I implemented resulted in an error, either a ValueError or a DataError, most often one stating that the index contains duplicate values. Your help is appreciated.
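For context, the duplicate-index error described above can be reproduced with a short sketch. It assumes the sample data is built by hand as below and that "Null" is a literal string rather than a missing value; the names dupes and pivot_error are only illustrative.

```python
import pandas as pd

# Sample data from the question; "Null" is assumed to be a literal string.
df = pd.DataFrame(
    {
        "Name": ["Joe", "Joe", "Joe", "Mary", "Mary", "Bob", "Mary"],
        "Attribute": ["A", "B", "B", "A", "B", "C", "A"],
        "Response": ["Yes", "smoking", "headache", "Null", "Never", "Today", "Tomorrow"],
    }
)

# Rows that share a (Name, Attribute) pair with at least one other row.
dupes = df[df.duplicated(["Name", "Attribute"], keep=False)]
print(dupes)

# DataFrame.pivot cannot place two values into one cell, so it raises.
pivot_error = None
try:
    df.pivot(index="Name", columns="Attribute", values="Response")
except ValueError as exc:
    pivot_error = str(exc)
    print(pivot_error)
```

Because pivot has no way to combine the two responses for (Joe, B) or (Mary, A), an aggregating reshape is needed instead.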

You can use .pivot_table() with aggfunc=list:

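# No values= is given, so the result has a two-level column index
# ("Response", <Attribute>); droplevel(0, axis=1) removes the outer level.
print(
    df.pivot_table(
        index="Name", columns="Attribute", aggfunc=list, fill_value="Null"
    ).droplevel(0, axis=1)
)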

Prints:

Attribute                 A                    B        C
Name                                                     
Bob                    Null                 Null  [Today]
Joe                   [Yes]  [smoking, headache]     Null
Mary       [Null, Tomorrow]              [Never]     Null

Or, if you don't want lists:

print(
    df.pivot_table(
        index="Name",
        columns="Attribute",
        aggfunc=",".join,
        fill_value="Null",
    ).droplevel(0, axis=1)
)

Prints:

Attribute              A                 B      C
Name                                             
Bob                 Null              Null  Today
Joe                  Yes  smoking,headache   Null
Mary       Null,Tomorrow             Never   Null
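As an alternative sketch, not part of the answer above, the same reshape can be written with groupby and unstack. This version joins with ", " to match the spacing in the desired output and reindexes so the names keep their order of first appearance; it again assumes "Null" is a literal string, and the variable name wide is only illustrative.

```python
import pandas as pd

# Sample data from the question; "Null" is assumed to be a literal string.
df = pd.DataFrame(
    {
        "Name": ["Joe", "Joe", "Joe", "Mary", "Mary", "Bob", "Mary"],
        "Attribute": ["A", "B", "B", "A", "B", "C", "A"],
        "Response": ["Yes", "smoking", "headache", "Null", "Never", "Today", "Tomorrow"],
    }
)

wide = (
    df.groupby(["Name", "Attribute"])["Response"]
    .agg(", ".join)                # combine duplicate responses into one string
    .unstack(fill_value="Null")    # move Attribute values into columns
    .reindex(df["Name"].unique())  # restore the original order of names
)
print(wide)
```

Selecting the "Response" column before aggregating keeps the columns as a flat index, so no droplevel call is needed here.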

EDIT: To rename the indices:

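df = df.pivot_table(
    index="Name",
    columns="Attribute",
    aggfunc=",".join,
    fill_value="Null",
).droplevel(0, axis=1)  # drop the outer "Response" level, as in the examples above

# Blank out the axis names ("Name" and "Attribute") shown in the header
df.index.name = ""
df.columns.name = ""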
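If the goal is to remove the axis names entirely rather than set them to empty strings, one possible variant is rename_axis. This is a sketch under the same assumptions about the sample data; it passes values="Response" so the columns are a flat index from the start, and wide is an illustrative name.

```python
import pandas as pd

# Sample data from the question; "Null" is assumed to be a literal string.
df = pd.DataFrame(
    {
        "Name": ["Joe", "Joe", "Joe", "Mary", "Mary", "Bob", "Mary"],
        "Attribute": ["A", "B", "B", "A", "B", "C", "A"],
        "Response": ["Yes", "smoking", "headache", "Null", "Never", "Today", "Tomorrow"],
    }
)

wide = df.pivot_table(
    index="Name",
    columns="Attribute",
    values="Response",   # a single value column keeps the columns flat
    aggfunc=", ".join,
    fill_value="Null",
)

# Remove the "Name" and "Attribute" axis labels from the result.
wide = wide.rename_axis(None, axis="index").rename_axis(None, axis="columns")
print(wide)
```

Passing a new string instead of None to rename_axis would rename the axis rather than clear it.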
