Pandas pivot_table - How to create a MultiIndex from a mix of column values and column names?

Question

I'm relatively new to Pandas. I have a DataFrame in the form:

         A         B       C            D         E
0        1       1.1       a      23.7853   18.2647
1        1       1.2       a      23.7118   17.2387
2        1       1.1       b      24.1873   17.3874
3        1       1.2       b      23.1873   18.1748
4        2       1.1       a      24.1872   18.1847
...      ...     ...       ...     ...       ...

I would like to pivot it to have a three-level MultiIndex constructed from the values in columns A and C and the column headers ["D", "E"]. I also want to use the values from B as the new column headers and the data in columns D and E for the values. All values are one-to-one (with some NaNs). If I understand correctly, I need to use pivot_table() instead of just pivot() because of the MultiIndex. Ultimately I want a table that looks like:

B                      1.1       1.2  ...
A    C  col-name
1    a         D   23.7853   23.7118  ...
               E   18.2647   17.2387  ...
     b         D   24.1873   23.1873  ...
               E   17.3874   18.1748  ...
2    a         D   24.1872   23.1987  ...
               E   18.1847   19.2387  ...
...  ...     ...     ...       ...    ...

I'm pretty sure the answer is to use some command like

pd.pivot_table(df, columns=["B"], values=["D","E"], index=["A","C","???"])

I'm unsure what to put in the "values" and "index" arguments to get the right behavior.

If I can't do this with a single pivot_table command, do I need to construct my MultiIndex ahead of time? Then what?

Thanks!

Answer 1

Create a MultiIndex from columns A, C, and B, then use stack + unstack to reshape the DataFrame:

df.set_index(['A', 'C', 'B']).stack().unstack(-2)

B          1.1      1.2
A C                    
1 a D  23.7853  23.7118
    E  18.2647  17.2387
  b D  24.1873  23.1873
    E  17.3874  18.1748
2 a D  24.1872      NaN
    E  18.1847      NaN
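For anyone who wants to run this end to end, here is a minimal sketch of the same stack + unstack chain, split into steps. It assumes the DataFrame holds only the five rows shown in the question; the index label "col_name" is an illustrative choice, since stack() leaves the new level unnamed.

```python
import pandas as pd

# Sample rows copied from the question (only the five rows shown there).
df = pd.DataFrame({
    "A": [1, 1, 1, 1, 2],
    "B": [1.1, 1.2, 1.1, 1.2, 1.1],
    "C": ["a", "a", "b", "b", "a"],
    "D": [23.7853, 23.7118, 24.1873, 23.1873, 24.1872],
    "E": [18.2647, 17.2387, 17.3874, 18.1748, 18.1847],
})

# Move A, C, B into the row index; D and E remain as columns.
indexed = df.set_index(["A", "C", "B"])

# stack() moves the column labels D/E into a new innermost index level,
# giving a Series indexed by (A, C, B, <column name>).
stacked = indexed.stack()

# unstack(-2) pivots the second-to-last level (B) out into the columns.
result = stacked.unstack(-2)

# The level created by stack() is unnamed; "col_name" is an illustrative label.
result = result.rename_axis(index=["A", "C", "col_name"])
print(result)
```

Combinations that are missing from the data, such as A=2, C=a, B=1.2 here, come out as NaN after the unstack.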

Answer 2

You can use pd.pivot_table() together with .stack(), as follows:

(pd.pivot_table(df, index=['A', 'C'], columns='B', values=["D","E"])
   .rename_axis(columns=['col_name', 'B'])         # set axis name for ["D","E"] 
   .stack(level=0)
)

Result:

B                 1.1      1.2
A C col_name                  
1 a D         23.7853  23.7118
    E         18.2647  17.2387
  b D         24.1873  23.1873
    E         17.3874  18.1748
2 a D         24.1872      NaN
    E         18.1847      NaN
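A self-contained version of this approach might look like the sketch below, again assuming only the five rows shown in the question. One thing to keep in mind: pivot_table aggregates with the mean by default, so it only reproduces the original values because each (A, C, B) combination occurs once, as the question states. If there were duplicates, they would be averaged silently.

```python
import pandas as pd

# Sample rows copied from the question (only the five rows shown there).
df = pd.DataFrame({
    "A": [1, 1, 1, 1, 2],
    "B": [1.1, 1.2, 1.1, 1.2, 1.1],
    "C": ["a", "a", "b", "b", "a"],
    "D": [23.7853, 23.7118, 24.1873, 23.1873, 24.1872],
    "E": [18.2647, 17.2387, 17.3874, 18.1748, 18.1847],
})

# Pivot with A and C as rows and B as columns. With two value columns the
# result has two column levels: the value name (D/E) on top, then B.
wide = pd.pivot_table(df, index=["A", "C"], columns="B", values=["D", "E"])

# Name the top column level so it carries a label once it becomes an index level.
wide = wide.rename_axis(columns=["col_name", "B"])

# Move the D/E level from the columns into the row index.
result = wide.stack(level=0)
print(result)
```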
