
Transforming pandas data frame using stack function

I have the following pandas data frame:

import pandas as pd
import numpy as np
np.random.seed(1)
N = 5
data = pd.DataFrame(np.random.rand(N, 3), columns=['Monday', 'Wednesday', 'Friday'])
data['State'] = 'ST' + pd.Series((np.arange(N) % 19).astype(str))
print(data)
     Monday  Wednesday    Friday State
0  0.417022   0.720324  0.000114   ST0
1  0.302333   0.146756  0.092339   ST1
2  0.186260   0.345561  0.396767   ST2
3  0.538817   0.419195  0.685220   ST3
4  0.204452   0.878117  0.027388   ST4

I want to transform this data frame into:

0   ST0   Monday           0.417022
          Wednesday       0.7203245
          Friday       0.0001143748
1   ST1   Monday          0.3023326
          Wednesday       0.1467559
          Friday         0.09233859
2   ST2   Monday          0.1862602
          Wednesday       0.3455607
          Friday          0.3967675
          State                 ST2
3   ST3   Monday          0.5388167
          Wednesday       0.4191945
          Friday          0.6852195
          State                 ST3
4   ST4   Monday          0.2044522
          Wednesday       0.8781174
          Friday         0.02738759
          State                 ST4

If I use data.stack() on its own, it gives something like this:

0  Monday           0.417022
   Wednesday       0.7203245
   Friday       0.0001143748
   State                 ST0
1  Monday          0.3023326
   Wednesday       0.1467559
   Friday         0.09233859
   State                 ST1
2  Monday          0.1862602
   Wednesday       0.3455607
   Friday          0.3967675
   State                 ST2
3  Monday          0.5388167
   Wednesday       0.4191945
   Friday          0.6852195
   State                 ST3
4  Monday          0.2044522
   Wednesday       0.8781174
   Friday         0.02738759
   State                 ST4

How can I make the State column the first level of the MultiIndex, with the other columns as the second level?

You just need to move the State column into the index before stacking:

data.set_index('State', append=True).stack()
Out[4]: 
   State           
0  ST0    Monday       0.417022
          Wednesday    0.720324
          Friday       0.000114
1  ST1    Monday       0.302333
          Wednesday    0.146756
          Friday       0.092339
2  ST2    Monday       0.186260
          Wednesday    0.345561
          Friday       0.396767
3  ST3    Monday       0.538817
          Wednesday    0.419195
          Friday       0.685220
4  ST4    Monday       0.204452
          Wednesday    0.878117
          Friday       0.027388
dtype: float64

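Note that this does not exactly match the output you posted: I did not include State among the days, because I think that is more sensible. If you really want it as in your original output, use: data.set_index('State', append=True, drop=False).stack()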
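A minimal, self-contained sketch comparing the two variants, assuming a recent pandas where `pd.np` is no longer available (NumPy is imported directly instead). The names `stacked` and `stacked_with_state` are illustrative and do not come from the original post.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question
np.random.seed(1)
N = 5
data = pd.DataFrame(np.random.rand(N, 3), columns=['Monday', 'Wednesday', 'Friday'])
data['State'] = 'ST' + pd.Series((np.arange(N) % 19).astype(str))

# append=True keeps the original row index and adds State as a second level;
# stack() then pushes the remaining column labels into a third level
stacked = data.set_index('State', append=True).stack()

# drop=False leaves State in the columns as well, so it is stacked with the days
stacked_with_state = data.set_index('State', append=True, drop=False).stack()

print(stacked.index.names)
print(len(stacked), len(stacked_with_state))
```

With `drop=False` the stacked values mix floats and strings, so the result can no longer be a plain float64 Series, which is one practical reason to prefer the first form.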

You can use melt with State as the id column:

In [24]: pd.melt(data, id_vars=['State'])
Out[24]:
   State   variable     value
0    ST0     Monday  0.417022
1    ST1     Monday  0.302333
2    ST2     Monday  0.186260
3    ST3     Monday  0.538817
4    ST4     Monday  0.204452
5    ST0  Wednesday  0.720324
6    ST1  Wednesday  0.146756
7    ST2  Wednesday  0.345561
8    ST3  Wednesday  0.419195
9    ST4  Wednesday  0.878117
10   ST0     Friday  0.000114
11   ST1     Friday  0.092339
12   ST2     Friday  0.396767
13   ST3     Friday  0.685220
14   ST4     Friday  0.027388
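The melt result is ordered by weekday rather than by state. If a state-first hierarchical index is wanted, one possible follow-up is sketched below. This is an assumption about what the asker might need, not part of the answer above, and the names `long_form` and `by_state` are illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question
np.random.seed(1)
N = 5
data = pd.DataFrame(np.random.rand(N, 3), columns=['Monday', 'Wednesday', 'Friday'])
data['State'] = 'ST' + pd.Series((np.arange(N) % 19).astype(str))

# Long format: one row per (State, weekday) pair
long_form = pd.melt(data, id_vars=['State'])

# A stable sort on State groups the rows by state while preserving the
# Monday / Wednesday / Friday order that melt produced within each state
by_state = (
    long_form.sort_values('State', kind='mergesort')
             .set_index(['State', 'variable'])['value']
)

print(by_state.head(6))
```

Unlike the stack approach, this drops the original integer row index, which may or may not matter depending on how the result is used.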
