Pandas: reshaping a DataFrame, splitting column names into a column and variables

Question

I have the following DataFrame that I am trying to melt:

import numpy as np
import pandas as pd
dates = pd.date_range('1/1/2014', periods=4)
df = pd.DataFrame(np.eye(4), index=dates, columns=['A_var1', 'A_var2', 'B_var1', 'B_var2'])
print(df)

             A_var1  A_var2  B_var1  B_var2
2014-01-01     1.0     0.0     0.0     0.0
2014-01-02     0.0     1.0     0.0     0.0
2014-01-03     0.0     0.0     1.0     0.0
2014-01-04     0.0     0.0     0.0     1.0

I want to obtain the following:

           type  var1  var2
2014-01-01    A   1.0   0.0
2014-01-01    B   0.0   0.0
2014-01-02    A   0.0   1.0
2014-01-02    B   0.0   0.0
2014-01-03    A   0.0   0.0
2014-01-03    B   1.0   0.0
2014-01-04    A   0.0   0.0
2014-01-04    B   0.0   1.0

Any idea how to do that efficiently? I know I can use the melt function, but I can't get it to work in this context.

Many thanks,

Answer 1

You could split the column names into a MultiIndex and then use stack on the first level of the multi-indexed columns.

In [304]: df.columns = df.columns.str.split('_', expand=True)

In [305]: df.stack(0).reset_index(1)
Out[305]:
           level_1  var1  var2
2014-01-01       A   1.0   0.0
2014-01-01       B   0.0   0.0
2014-01-02       A   0.0   1.0
2014-01-02       B   0.0   0.0
2014-01-03       A   0.0   0.0
2014-01-03       B   1.0   0.0
2014-01-04       A   0.0   0.0
2014-01-04       B   0.0   1.0
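
The output above labels the new column level_1 rather than type, as asked for in the question. A minimal self-contained sketch of one way to get that header, assuming it is acceptable to name the column levels before stacking (the level name 'type' and the variable name out are illustrative choices, not part of the original answer):

```python
import numpy as np
import pandas as pd

dates = pd.date_range('1/1/2014', periods=4)
df = pd.DataFrame(np.eye(4), index=dates,
                  columns=['A_var1', 'A_var2', 'B_var1', 'B_var2'])

# Split 'A_var1' into the pair ('A', 'var1') and name the first level 'type'.
df.columns = df.columns.str.split('_', expand=True).set_names(['type', None])

# Move the 'type' level from the columns into the rows, then turn it
# into an ordinary column so the dates remain the only index.
out = df.stack(level='type').reset_index(level='type')
print(out)
```

Recent pandas 2.x releases may emit a FutureWarning about the stack implementation here; the result for this frame should be the same either way.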

Answer 2

One option is the pivot_longer function from pyjanitor, using the .value placeholder:

# pip install pyjanitor
import pandas as pd
import janitor  # importing janitor registers pivot_longer on DataFrame

df.pivot_longer(names_to=("type", ".value"),
                names_sep="_",
                ignore_index=False,
                sort_by_appearance=True)

          type  var1  var2
2014-01-01   A   1.0   0.0
2014-01-01   B   0.0   0.0
2014-01-02   A   0.0   1.0
2014-01-02   B   0.0   0.0
2014-01-03   A   0.0   0.0
2014-01-03   B   1.0   0.0
2014-01-04   A   0.0   0.0
2014-01-04   B   0.0   1.0

The .value placeholder keeps the part of the column name associated with it as a header (var1, var2), while the rest of the name goes into the type column.
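
For readers who prefer to stay with melt, as the question attempted, the same split can be approximated in plain pandas. This is only a sketch under assumptions: the helper names long and out and the temporary column names date and name are made up for illustration, and passing a list to pivot's index argument needs pandas 1.1 or later.

```python
import numpy as np
import pandas as pd

dates = pd.date_range('1/1/2014', periods=4)
df = pd.DataFrame(np.eye(4), index=dates,
                  columns=['A_var1', 'A_var2', 'B_var1', 'B_var2'])

# Melt to long form: one row per (date, original column name).
long = df.rename_axis('date').reset_index().melt(id_vars='date')

# Split 'A_var1' into type 'A' and variable name 'var1'.
long[['type', 'name']] = long['variable'].str.split('_', expand=True)

# Spread the variable names back out into columns, one row per (date, type).
out = (long.pivot(index=['date', 'type'], columns='name', values='value')
           .reset_index('type')
           .rename_axis(columns=None))
print(out)
```

Compared with stack, this takes more steps, but each intermediate frame can be inspected, which may help when the column names do not split as cleanly as they do here.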
