Parse pandas column names to create multi-indexed dataframe
I have a DataFrame that looks like this:
region 2008_indicatorA 2008_indicatorB ...(2009..2019)... 2020_indicatorA 2020_indicatorB
=============================================================================================
State1 ... ... ... ...
State2 ... ... ... ...
...
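I need to extract the year from the column names into a separate year column, which also reduces the number of columns. The resulting DataFrame should look like this: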
region year indicatorA indicatorB
========================================
State1 2008 ... ...
State1 2009 ... ...
...
State1 (..2020) ... ...
...
State2 2008 ... ...
...
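Use DataFrame.set_index to move region into the index, split the remaining column names on '_' with str.split(expand=True) so the columns become a MultiIndex, then name the year level with DataFrame.rename_axis and reshape with DataFrame.stack: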
print(df)
   region  2008_indicatorA  2008_indicatorB  2020_indicatorA  2020_indicatorB
0  State1                1                3                5                8
1  State2                7                5                3                9
df1 = df.set_index('region')
df1.columns = df1.columns.str.split('_', expand=True)
df1 = df1.rename_axis(['year',None], axis=1).stack(0).reset_index()
print(df1)
   region  year  indicatorA  indicatorB
0  State1  2008           1           3
1  State1  2020           5           8
2  State2  2008           7           5
3  State2  2020           3           9
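One detail worth keeping in mind: because the years come from splitting column-name strings, the resulting year column holds strings rather than integers. The sketch below rebuilds the sample frame from the printed output above (the construction of df is an assumption based on that output) and adds an integer conversion as an optional extra step that is not part of the original answer.

```python
import pandas as pd

# Sample data reconstructed from the printed frame (assumed layout).
df = pd.DataFrame({
    "region": ["State1", "State2"],
    "2008_indicatorA": [1, 7],
    "2008_indicatorB": [3, 5],
    "2020_indicatorA": [5, 3],
    "2020_indicatorB": [8, 9],
})

# Same steps as the answer: index by region, split column names into
# (year, indicator) levels, then stack the year level into the rows.
df1 = df.set_index("region")
df1.columns = df1.columns.str.split("_", expand=True)
df1 = df1.rename_axis(["year", None], axis=1).stack(0).reset_index()

# Optional extra step: the year labels are strings after the split,
# so convert them if numeric comparisons or sorting are needed.
df1["year"] = df1["year"].astype(int)

print(df1)
```

Recent pandas releases may emit a FutureWarning about the stack implementation here; the result for this data should be the same either way, since no values are missing.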
To get a DataFrame with a MultiIndex instead, omit the DataFrame.reset_index call:
df1 = df.set_index('region')
df1.columns = df1.columns.str.split('_', expand=True)
df1 = df1.rename_axis(['year',None], axis=1).stack(0)
print(df1)
             indicatorA  indicatorB
region year
State1 2008           1           3
       2020           5           8
State2 2008           7           5
       2020           3           9
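As a rough illustration of why the MultiIndex form can be convenient, the sketch below selects from the stacked result by (region, year). The variable name stacked and the reconstructed sample data are assumptions for this example; note that the year labels are strings here, because they were produced by splitting the column names.

```python
import pandas as pd

# Sample data reconstructed from the printed frame (assumed layout).
df = pd.DataFrame({
    "region": ["State1", "State2"],
    "2008_indicatorA": [1, 7],
    "2008_indicatorB": [3, 5],
    "2020_indicatorA": [5, 3],
    "2020_indicatorB": [8, 9],
})

stacked = df.set_index("region")
stacked.columns = stacked.columns.str.split("_", expand=True)
stacked = stacked.rename_axis(["year", None], axis=1).stack(0)

# A single value, addressed by the (region, year) pair and a column name.
value = stacked.loc[("State1", "2020"), "indicatorA"]

# All regions for one year: a cross-section on the 'year' level.
year_2008 = stacked.xs("2008", level="year")

print(value)
print(year_2008)
```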