Transforming pandas data frame using stack function

I have the following pandas DataFrame:

import pandas as pd
import numpy as np
np.random.seed(1)
N = 5
data = pd.DataFrame(np.random.rand(N, 3), columns=['Monday', 'Wednesday', 'Friday'])
data['State'] = 'ST' + pd.Series((np.arange(N) % 19).astype(str))
print(data)
     Monday  Wednesday    Friday State
0  0.417022   0.720324  0.000114   ST0
1  0.302333   0.146756  0.092339   ST1
2  0.186260   0.345561  0.396767   ST2
3  0.538817   0.419195  0.685220   ST3
4  0.204452   0.878117  0.027388   ST4

I want to convert this DataFrame to:

0   ST0   Monday           0.417022
          Wednesday       0.7203245
          Friday       0.0001143748
          State                 ST0
1   ST1   Monday          0.3023326
          Wednesday       0.1467559
          Friday         0.09233859
          State                 ST1
2   ST2   Monday          0.1862602
          Wednesday       0.3455607
          Friday          0.3967675
          State                 ST2
3   ST3   Monday          0.5388167
          Wednesday       0.4191945
          Friday          0.6852195
          State                 ST3
4   ST4   Monday          0.2044522
          Wednesday       0.8781174
          Friday         0.02738759
          State                 ST4

If I use data.stack() alone, I get something like this:

0  Monday           0.417022
   Wednesday       0.7203245
   Friday       0.0001143748
   State                 ST0
1  Monday          0.3023326
   Wednesday       0.1467559
   Friday         0.09233859
   State                 ST1
2  Monday          0.1862602
   Wednesday       0.3455607
   Friday          0.3967675
   State                 ST2
3  Monday          0.5388167
   Wednesday       0.4191945
   Friday          0.6852195
   State                 ST3
4  Monday          0.2044522
   Wednesday       0.8781174
   Friday         0.02738759
   State                 ST4

How can I make the State column the first level of the resulting multi-index, with the other column names as the second level?

You just need to move the State column into the index before stacking:

data.set_index('State', append=True).stack()
Out[4]: 
   State           
0  ST0    Monday       0.417022
          Wednesday    0.720324
          Friday       0.000114
1  ST1    Monday       0.302333
          Wednesday    0.146756
          Friday       0.092339
2  ST2    Monday       0.186260
          Wednesday    0.345561
          Friday       0.396767
3  ST3    Monday       0.538817
          Wednesday    0.419195
          Friday       0.685220
4  ST4    Monday       0.204452
          Wednesday    0.878117
          Friday       0.027388
dtype: float64

Note that this doesn't exactly match the output you posted: I haven't repeated State alongside the days, as I think it's more sensible this way. If you really want it like your original output, use: data.set_index('State', append=True, drop=False).stack()
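For anyone who wants to try both variants, here is a minimal self-contained sketch. It assumes a recent pandas in which pd.np is no longer available, so NumPy is called directly; the variable names stacked, with_state and value are only illustrative. Recent pandas versions may emit a FutureWarning about the stack implementation, which does not affect the result here.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question (NumPy called directly).
np.random.seed(1)
N = 5
data = pd.DataFrame(np.random.rand(N, 3), columns=['Monday', 'Wednesday', 'Friday'])
data['State'] = 'ST' + pd.Series(np.arange(N).astype(str))

# Move State into the index (keeping the original row index), then stack
# the remaining day columns into the innermost index level.
stacked = data.set_index('State', append=True).stack()

# drop=False keeps State as a column too, so it is stacked alongside the days.
with_state = data.set_index('State', append=True, drop=False).stack()

# Individual values can be looked up with a (row, state, day) tuple.
value = stacked.loc[(2, 'ST2', 'Monday')]

print(stacked.head(6))
print(with_state.head(4))
```

The first result is a float Series with three index levels; the second has object dtype because the State strings are mixed in with the numbers.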

You could also use melt, keeping the State column as the identifier:

In [24]: pd.melt(data, id_vars=['State'])
Out[24]:
   State   variable     value
0    ST0     Monday  0.417022
1    ST1     Monday  0.302333
2    ST2     Monday  0.186260
3    ST3     Monday  0.538817
4    ST4     Monday  0.204452
5    ST0  Wednesday  0.720324
6    ST1  Wednesday  0.146756
7    ST2  Wednesday  0.345561
8    ST3  Wednesday  0.419195
9    ST4  Wednesday  0.878117
10   ST0     Friday  0.000114
11   ST1     Friday  0.092339
12   ST2     Friday  0.396767
13   ST3     Friday  0.685220
14   ST4     Friday  0.027388
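The melted frame is ordered by day rather than by state. If you want it grouped by state, similar to the stacked output, something like the following sketch might work. The column names Day and Value and the variable names long_df and by_state are my own choices for illustration, not part of the question.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
np.random.seed(1)
N = 5
data = pd.DataFrame(np.random.rand(N, 3), columns=['Monday', 'Wednesday', 'Friday'])
data['State'] = 'ST' + pd.Series(np.arange(N).astype(str))

# Melt to long format; 'Day' and 'Value' are illustrative column names.
long_df = pd.melt(data, id_vars=['State'], var_name='Day', value_name='Value')

# A stable sort on State keeps the Monday/Wednesday/Friday order within
# each state; the two label columns then become a two-level index.
by_state = (
    long_df.sort_values('State', kind='mergesort')
           .set_index(['State', 'Day'])['Value']
)

print(by_state)
```

Unlike the set_index/stack approach, this drops the original integer row index, which may or may not matter for your use case.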
