Reshaping a data frame in pandas

Question

Let's say I have this data frame:

df = pd.DataFrame({'n':[0 ,1 ,0 ,0 ,1 ,1 ,0 ,1],'l':[12 ,16 ,92, 77 ,32 ,47, 22, 14], 'cols':['col1','col1','col1','col1','col2','col2','col2','col2']})

and this is what I'm trying to get:

col1    col2
l   n   l   n
12  0   32  1
16  1   47  1
92  0   22  0
77  0   14  1

I've been playing around with the set_index and stack / unstack methods, but with no success...

Answer 1

import pandas as pd

df = pd.DataFrame(
    {'n':[0 ,1 ,0 ,0 ,1 ,1 ,0 ,1],'l':[12 ,16 ,92, 77 ,32 ,47, 22, 14],
     'cols':['col1','col1','col1','col1','col2','col2','col2','col2']})

df['index'] = df.groupby(['cols']).cumcount()
result = df.pivot(index='index', columns='cols')
print(result)
#           l           n      
# cols   col1  col2  col1  col2
# index                        
# 0        12    32     0     1
# 1        16    47     1     1
# 2        92    22     0     0
# 3        77    14     0     1

If you care about the order of the labels in the MultiIndex column, you could use stack and unstack to exactly reproduce the result you posted:

result = result.stack(level=0).unstack(level=1)
print(result)

# cols   col1     col2   
#           l  n     l  n
# index                  
# 0        12  0    32  1
# 1        16  1    47  1
# 2        92  0    22  0
# 3        77  0    14  1
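As a possible alternative to the stack / unstack round trip, the two column levels can be swapped and then sorted. This is a sketch of a different technique, not the one used above; the name reordered is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'n': [0, 1, 0, 0, 1, 1, 0, 1],
    'l': [12, 16, 92, 77, 32, 47, 22, 14],
    'cols': ['col1'] * 4 + ['col2'] * 4,
})

# Same pivot as above: number the rows within each group, then pivot on that number.
result = df.assign(index=df.groupby('cols').cumcount()).pivot(index='index', columns='cols')

# Put the 'cols' level on top, then sort the column labels so that
# each of col1 / col2 is followed by its l and n sub-columns.
reordered = result.swaplevel(0, 1, axis=1).sort_index(axis=1)
print(reordered)
```

Sorting the column labels gives the l, n order only because 'l' happens to sort before 'n'; with other column names the stack / unstack version may be the safer choice.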

When looking for a solution, it is often useful to think backwards.

Start with the desired DataFrame and ask yourself what operation might produce it. In this case, the operation that came to mind was pivot. Then the question becomes: what DataFrame, something, is needed so that

desired = something.pivot(index='index', columns='cols')

By looking at other examples of pivot in action, it became clear that something had to equal

   cols   l  n  index
0  col1  12  0      0
1  col1  16  1      1
2  col1  92  0      2
3  col1  77  0      3
4  col2  32  1      0
5  col2  47  1      1
6  col2  22  0      2
7  col2  14  1      3

Then you see whether you can find a way to massage df into something or, working backwards again, massage something into df. From this point of view, the missing link became apparent: something has an index column that df lacked.
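To make that missing link concrete, here is a small sketch of how the index column can be built with groupby and cumcount, which numbers the rows within each group starting from 0. The variable names something and desired follow the prose above; using assign instead of mutating df is just a choice for this example.

```python
import pandas as pd

df = pd.DataFrame({
    'n': [0, 1, 0, 0, 1, 1, 0, 1],
    'l': [12, 16, 92, 77, 32, 47, 22, 14],
    'cols': ['col1'] * 4 + ['col2'] * 4,
})

# cumcount restarts at 0 for every distinct value of 'cols',
# giving each row its position inside its own group.
something = df.assign(index=df.groupby('cols').cumcount())
print(something['index'].tolist())  # [0, 1, 2, 3, 0, 1, 2, 3]

# With that column in place, pivot can line the groups up side by side.
desired = something.pivot(index='index', columns='cols')
print(desired)
```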

Answer 2

You can use a combination of DataFrame.groupby, DataFrame.reset_index and DataFrame.T (transpose):

import pandas as pd

df = pd.DataFrame({'n':[0 ,1 ,0 ,0 ,1 ,1 ,0, 1],'l':[12 ,16 ,92, 77 ,32 ,47, 22, 14], 'cols':['col1','col1','col1','col1','col2','col2','col2','col2']})
print(df.groupby('cols').apply(lambda x: x.reset_index(drop=True).drop('cols',axis=1).T).T)

Output:

cols  col1     col2   
         l  n     l  n
0       12  0    32  1
1       16  1    47  1
2       92  0    22  0
3       77  0    14  1

Or you can use concat:

print(pd.concat([g.drop('cols',axis=1).reset_index(drop=True) for _,g in df.groupby('cols')],axis=1,keys=df['cols'].unique()))

Output:

   col1     col2   
      l  n     l  n
0    12  0    32  1
1    16  1    47  1
2    92  0    22  0
3    77  0    14  1
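A variation on the concat approach, sketched here as an assumption rather than as part of the original answer: passing a dict to pd.concat ties each piece to its group name directly, so the result does not depend on df['cols'].unique() happening to come out in the same order as the groups. The names pieces and wide are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'n': [0, 1, 0, 0, 1, 1, 0, 1],
    'l': [12, 16, 92, 77, 32, 47, 22, 14],
    'cols': ['col1'] * 4 + ['col2'] * 4,
})

# One piece per group: keep only l and n (in that order) and
# renumber the rows from 0 so the pieces align when placed side by side.
pieces = {
    name: group[['l', 'n']].reset_index(drop=True)
    for name, group in df.groupby('cols')
}

# The dict keys become the outer level of the column MultiIndex.
wide = pd.concat(pieces, axis=1)
print(wide)
```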

Hope it helps :)
