I have the following pandas dataframe:
import pandas as pd
import numpy as np
np.random.seed(1)
N = 5
data = pd.DataFrame(np.random.rand(N, 3), columns=['Monday', 'Wednesday', 'Friday'])
data['State'] = 'ST' + pd.Series((np.arange(N) % 19).astype(str))
print(data)
Monday Wednesday Friday State
0 0.417022 0.720324 0.000114 ST0
1 0.302333 0.146756 0.092339 ST1
2 0.186260 0.345561 0.396767 ST2
3 0.538817 0.419195 0.685220 ST3
4 0.204452 0.878117 0.027388 ST4
I want to convert this dataframe to the following:
0 ST0 Monday 0.417022
Wednesday 0.7203245
Friday 0.0001143748
1 ST1 Monday 0.3023326
Wednesday 0.1467559
Friday 0.09233859
2 ST2 Monday 0.1862602
Wednesday 0.3455607
Friday 0.3967675
State ST2
3 ST3 Monday 0.5388167
Wednesday 0.4191945
Friday 0.6852195
State ST3
4 ST4 Monday 0.2044522
Wednesday 0.8781174
Friday 0.02738759
State ST4
If I use data.stack() alone, it gives something like this:
0 Monday 0.417022
Wednesday 0.7203245
Friday 0.0001143748
State ST0
1 Monday 0.3023326
Wednesday 0.1467559
Friday 0.09233859
State ST1
2 Monday 0.1862602
Wednesday 0.3455607
Friday 0.3967675
State ST2
3 Monday 0.5388167
Wednesday 0.4191945
Friday 0.6852195
State ST3
4 Monday 0.2044522
Wednesday 0.8781174
Friday 0.02738759
State ST4
How can I make the State column the first level of the multi-index, with the other column names as the second level?
You just need to move the State column into the index before stacking:
data.set_index('State', append=True).stack()
Out[4]:
State
0 ST0 Monday 0.417022
Wednesday 0.720324
Friday 0.000114
1 ST1 Monday 0.302333
Wednesday 0.146756
Friday 0.092339
2 ST2 Monday 0.186260
Wednesday 0.345561
Friday 0.396767
3 ST3 Monday 0.538817
Wednesday 0.419195
Friday 0.685220
4 ST4 Monday 0.204452
Wednesday 0.878117
Friday 0.027388
dtype: float64
Note that this doesn't exactly match the output you posted. I haven't included State alongside the days, because I think it's more sensible this way. If you really want it like your original output, keep the column when moving it into the index: data.set_index('State', append=True, drop=False).stack()
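For reference, here is a minimal, self-contained sketch of the variants discussed above, using the same seeded data as the question. The variable names `stacked`, `with_state` and `two_level` are only illustrative. The last variant (dropping the original integer index so that State becomes the outermost level) is not part of the answer above; it is an assumption about what "State as first level" might mean.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
np.random.seed(1)
N = 5
data = pd.DataFrame(np.random.rand(N, 3), columns=['Monday', 'Wednesday', 'Friday'])
data['State'] = 'ST' + pd.Series(np.arange(N).astype(str))

# Append State to the existing integer index, then stack the day columns.
# The result is a Series with a three-level index: (row number, State, day).
stacked = data.set_index('State', append=True).stack()

# drop=False keeps State as a column too, so it also shows up as a stacked row.
with_state = data.set_index('State', append=True, drop=False).stack()

# Assumed variant: replace the integer index entirely,
# giving a two-level index of (State, day).
two_level = data.set_index('State').stack()

print(stacked.head(3))
print(two_level.loc['ST0'])
```

Because `with_state` mixes floats and strings, pandas stores it as an object-dtype Series, which is one practical reason to prefer the first form.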
You could use melt with State as the identifier column, like this:
In [24]: pd.melt(data, id_vars=['State'])
Out[24]:
State variable value
0 ST0 Monday 0.417022
1 ST1 Monday 0.302333
2 ST2 Monday 0.186260
3 ST3 Monday 0.538817
4 ST4 Monday 0.204452
5 ST0 Wednesday 0.720324
6 ST1 Wednesday 0.146756
7 ST2 Wednesday 0.345561
8 ST3 Wednesday 0.419195
9 ST4 Wednesday 0.878117
10 ST0 Friday 0.000114
11 ST1 Friday 0.092339
12 ST2 Friday 0.396767
13 ST3 Friday 0.685220
14 ST4 Friday 0.027388
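The melted frame is a flat table rather than a multi-indexed Series. If the multi-index is still wanted, one possible follow-up is sketched below. The column names `Day` and `Value` and the variable names `long_form` and `indexed` are illustrative choices, not part of the answer above.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
np.random.seed(1)
N = 5
data = pd.DataFrame(np.random.rand(N, 3), columns=['Monday', 'Wednesday', 'Friday'])
data['State'] = 'ST' + pd.Series(np.arange(N).astype(str))

# Melt to long form, naming the generated columns explicitly
# instead of the default 'variable' and 'value'.
long_form = pd.melt(data, id_vars=['State'], var_name='Day', value_name='Value')

# Move State and Day into the index to get a two-level multi-index Series.
indexed = long_form.set_index(['State', 'Day'])['Value']

print(long_form.head())
print(indexed.loc['ST0'])
```

Note that melt orders rows by day first and then by state, so `indexed` is grouped by day unless it is sorted afterwards.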