Pandas boolean value pivot on multiindex dataframe
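Hi, I'm having trouble pivoting a table based on a value column.
Suppose we have a MultiIndex dataframe grade whose index levels are Country, Date and Group, with a single column Status: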
                            Status
Country Date       Group
US      2019-12-31 Group A  Absent
                   Group B  Not Pass
                   Group C  Absent
        2020-01-02 Group A  Pass
                   Group B  Pass
                   Group C  Pass
...     ...        ...      ...
ID      2020-04-14 Group A  Pass
                   Group B  Pass
                   Group C  Pass
        2020-04-15 Group A  Pass
                   Group B  Pass
                   Group C  Pass
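For anyone who wants to reproduce this, here is a minimal sketch that rebuilds only the rows visible above. The elided rows are left out, and storing Date as plain strings is an assumption, since the printout does not show the dtype.

```python
import pandas as pd

# Only the rows visible in the printout; the elided "..." rows are omitted.
# Assumption: Date is kept as a plain string (the real dtype is not shown).
dates = {"US": ["2019-12-31", "2020-01-02"], "ID": ["2020-04-14", "2020-04-15"]}
groups = ["Group A", "Group B", "Group C"]
non_pass = {("US", "2019-12-31"): ["Absent", "Not Pass", "Absent"]}

rows = []
for country, country_dates in dates.items():
    for date in country_dates:
        # Every visible row is "Pass" except US / 2019-12-31.
        statuses = non_pass.get((country, date), ["Pass"] * len(groups))
        for group, status in zip(groups, statuses):
            rows.append((country, date, group, status))

grade = (
    pd.DataFrame(rows, columns=["Country", "Date", "Group", "Status"])
    .set_index(["Country", "Date", "Group"])
)
print(grade)
```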
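I want to unstack the Group level together with the Status column and build a checklist from the Status values.
In the end we should get a new dataframe checklist_grade with the columns Absent, Not Pass and Pass under each group, where the value v marks the column that matches that row's status.
To make it easier to understand, here is an illustration of what we want: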
                    Status
                    Group A               Group B               Group C
Country Date        Absent Not Pass Pass  Absent Not Pass Pass  Absent Not Pass Pass
US      2019-12-31  v                            v              v
        2020-01-02                  v                     v                     v
...     ...         ...    ...      ...   ...    ...      ...   ...    ...      ...
ID      2020-04-14                  v                     v                     v
        2020-04-15                  v                     v                     v
I tried unstacking the grade dataframe, but that only breaks the data out by Group:
                    Status
                    Group A  Group B   Group C
Country Date
US      2019-12-31  Absent   Not Pass  Absent
        2020-01-02  Pass     Pass      Pass
...     ...         ...      ...       ...
ID      2020-04-14  Pass     Pass      Pass
        2020-04-15  Pass     Pass      Pass
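A small sketch of that attempt, using just the US rows shown above (dates stored as strings is an assumption). unstack only moves index levels into the columns, and Status is a value column, so its strings stay in the cells:

```python
import pandas as pd

# US rows only, taken from the printout; Date as strings is an assumption.
idx = pd.MultiIndex.from_product(
    [["US"], ["2019-12-31", "2020-01-02"], ["Group A", "Group B", "Group C"]],
    names=["Country", "Date", "Group"],
)
grade = pd.DataFrame(
    {"Status": ["Absent", "Not Pass", "Absent", "Pass", "Pass", "Pass"]},
    index=idx,
)

# Group is an index level, so it can be unstacked into the columns.
# Status is not, so its values remain as cell contents.
wide = grade.unstack("Group")
print(wide)
```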
Create a new column, move Status into the MultiIndex, and reshape with DataFrame.unstack:
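df = (df.assign(New='v')                       # marker value for the checklist cells
        .set_index('Status', append=True)      # Status becomes the 4th index level
        .unstack([2, 3])                       # move Group and Status into the columns
        .rename(columns={'New': 'Status'}))    # relabel the top column level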
print (df)
                    Status
Group               Group A  Group B   Group C  Group A  Group B  Group C
Status              Absent   Not Pass  Absent   Pass     Pass     Pass
Country Date
ID      2020-04-14  NaN      NaN       NaN      v        v        v
        2020-04-15  NaN      NaN       NaN      v        v        v
US      2019-12-31  v        v         v        NaN      NaN      NaN
        2020-01-02  NaN      NaN       NaN      v        v        v
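A variant of the same chain that refers to the levels by name rather than by position, which may be easier to read. This is only a sketch on the US rows (dates as strings is an assumption), and the level=0 argument to rename is my addition, to restrict the relabel to the top column level:

```python
import pandas as pd

# US rows only, taken from the printout; Date as strings is an assumption.
idx = pd.MultiIndex.from_product(
    [["US"], ["2019-12-31", "2020-01-02"], ["Group A", "Group B", "Group C"]],
    names=["Country", "Date", "Group"],
)
grade = pd.DataFrame(
    {"Status": ["Absent", "Not Pass", "Absent", "Pass", "Pass", "Pass"]},
    index=idx,
)

checklist = (
    grade.assign(New="v")                        # marker value for each cell
    .set_index("Status", append=True)            # Status joins the index
    .unstack(["Group", "Status"])                # levels referenced by name
    .rename(columns={"New": "Status"}, level=0)  # relabel only the top level
)
print(checklist)
```

Only the (group, status) pairs that actually occur in the data become columns here, which is why the output above has six columns rather than nine.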
Finally, if you need every combination of the column levels in the MultiIndex, add DataFrame.reindex with MultiIndex.from_product:
df = df.reindex(pd.MultiIndex.from_product(df.columns.levels), axis=1)
print (df)
                    Status                                              \
                    Group A               Group B               Group C
                    Absent Not Pass Pass  Absent Not Pass Pass  Absent
Country Date
ID      2020-04-14  NaN    NaN      v     NaN    NaN      v     NaN
        2020-04-15  NaN    NaN      v     NaN    NaN      v     NaN
US      2019-12-31  v      NaN      NaN   NaN    v        NaN   v
        2020-01-02  NaN    NaN      v     NaN    NaN      v     NaN

                    Not Pass Pass
Country Date
ID      2020-04-14  NaN      v
        2020-04-15  NaN      v
US      2019-12-31  NaN      NaN
        2020-01-02  NaN      v
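If the goal is the checklist itself, the NaN cells can be blanked out, or the frame can be turned into booleans. A sketch under a few assumptions: US rows only, dates stored as strings, and the names wide and checklist_bool are just illustrative.

```python
import pandas as pd

# US rows only, taken from the printout; Date as strings is an assumption.
idx = pd.MultiIndex.from_product(
    [["US"], ["2019-12-31", "2020-01-02"], ["Group A", "Group B", "Group C"]],
    names=["Country", "Date", "Group"],
)
grade = pd.DataFrame(
    {"Status": ["Absent", "Not Pass", "Absent", "Pass", "Pass", "Pass"]},
    index=idx,
)

wide = (
    grade.assign(New="v")
    .set_index("Status", append=True)
    .unstack(["Group", "Status"])
    .rename(columns={"New": "Status"}, level=0)
)
# Add the (group, status) combinations that never occur in the data.
wide = wide.reindex(pd.MultiIndex.from_product(wide.columns.levels), axis=1)

checklist_grade = wide.fillna("")  # blank cells instead of NaN
checklist_bool = wide.notna()      # True where the status applies
print(checklist_grade)
print(checklist_bool)
```

The boolean form is handy for counting or filtering, while the fillna version is closer to the printed checklist in the question.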