
Convert columns to rows based on row value using pandas

I have an Excel file as follows:

ID      wk48    wk49    wk50    wk51    wk52    wk1     wk2
1123    10      22      233     2       4       22      11
1198    9       4       44      23      34      5       234
101     3       6       3       43      33      34      78

I want the following output in Python:

1123 wk48  10
1123 wk49  22
1123 wk50  233
1123 wk51  2
1123 wk52  4
1123 wk1   22
1123 wk2   11
1198 wk48  9
1198 wk49  4
1198 wk50  44
1198 wk51  23
1198 wk52  34
1198 wk1   5
1198 wk2   234

Any suggestions?

The first thing I would do is set ID as your index with:

df.set_index('ID',inplace=True)

Next, you can use the following command to reorient your dataframe:

df = df.stack().reset_index()
print(df)
-------------------------------------------
Output:
      ID level_1    0
0   1123    wk48   10
1   1123    wk49   22
2   1123    wk50  233
3   1123    wk51    2
4   1123    wk52    4
5   1123     wk1   22
6   1123     wk2   11
7   1198    wk48    9
8   1198    wk49    4
9   1198    wk50   44
10  1198    wk51   23
11  1198    wk52   34
12  1198     wk1    5
13  1198     wk2  234
14   101    wk48    3
15   101    wk49    6
16   101    wk50    3
17   101    wk51   43
18   101    wk52   33
19   101     wk1   34
20   101     wk2   78
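
The stacked frame above keeps the default column names level_1 and 0. As a minimal sketch, assuming the data is built in memory from the question's table rather than read from the Excel file, the same two steps can be chained and the columns renamed. The names long_df, Week and Value are illustrative choices, not part of the original answer:

```python
import pandas as pd

# Sample data copied from the question; in practice this would come
# from the Excel file.
df = pd.DataFrame(
    {
        "ID": [1123, 1198, 101],
        "wk48": [10, 9, 3],
        "wk49": [22, 4, 6],
        "wk50": [233, 44, 3],
        "wk51": [2, 23, 43],
        "wk52": [4, 34, 33],
        "wk1": [22, 5, 34],
        "wk2": [11, 234, 78],
    }
)

# Move ID into the index, stack the week columns into rows,
# then turn the resulting index levels back into ordinary columns.
long_df = df.set_index("ID").stack().reset_index()

# Replace the default names (level_1 and 0) with descriptive ones.
long_df.columns = ["ID", "Week", "Value"]

print(long_df.head(7))
```

Avoiding inplace=True here leaves the original df untouched, which makes it easier to re-run the reshaping step.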

Read the data in pandas:

In [1]: import pandas as pd
        df = pd.read_excel('yourfile.xlsx', index_col=0)

Then use DataFrame.stack to move the columns to the index:

In [2]: s = df.stack()
        # Rename the index names for cleaner output
        s.index.names = ['ID','Week']

You'll get a pd.Series in the shape you want:

In [3]: s.head(10)
Out[3]: 
ID    Week    
1123  wk48     10
      wk49     22
      wk50    233
      wk51      2
      wk52      4
      wk1      22
      wk2      11
1198  wk48      9
      wk49      4
      wk50     44
dtype: int64
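
Because the stacked Series is indexed by (ID, Week), individual values can be looked up by label. The following is a rough sketch, assuming the question's data is built in memory instead of being read from 'yourfile.xlsx'; the variable names weeks_1123 and single are made up for illustration:

```python
import pandas as pd

# In-memory stand-in for pd.read_excel('yourfile.xlsx', index_col=0).
df = pd.DataFrame(
    {
        "ID": [1123, 1198, 101],
        "wk48": [10, 9, 3],
        "wk49": [22, 4, 6],
        "wk50": [233, 44, 3],
        "wk51": [2, 23, 43],
        "wk52": [4, 34, 33],
        "wk1": [22, 5, 34],
        "wk2": [11, 234, 78],
    }
).set_index("ID")

s = df.stack()
s.index.names = ["ID", "Week"]

# All weeks for one ID: selecting on the first index level
# returns a Series indexed by Week.
weeks_1123 = s.loc[1123]

# A single value: pass the full (ID, Week) key as a tuple.
single = s.loc[(1198, "wk2")]

print(weeks_1123)
print(single)
```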

If you want that back as a DataFrame:

In [4]: s.name = 'Value'
        df = s.to_frame()

You can then export that back to Excel, for example:

In [5]: df.to_excel('newfile.xlsx')

Or, if you don't want the ID values grouped (merged cells) in Excel, you can do this:

In [6]: df.reset_index().to_excel('newfile.xlsx', index=False)
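
As an alternative to stack, which neither answer above uses, DataFrame.melt can produce the same flat three-column layout directly. This is only a sketch under the assumption that ID is an ordinary column (that is, the file was read without index_col=0); the names tidy, Week and Value are illustrative:

```python
import pandas as pd

# Sample data copied from the question, with ID kept as a regular column.
df = pd.DataFrame(
    {
        "ID": [1123, 1198, 101],
        "wk48": [10, 9, 3],
        "wk49": [22, 4, 6],
        "wk50": [233, 44, 3],
        "wk51": [2, 23, 43],
        "wk52": [4, 34, 33],
        "wk1": [22, 5, 34],
        "wk2": [11, 234, 78],
    }
)

# melt keeps ID as an identifier and unpivots every other column
# into (Week, Value) pairs. Rows come out grouped by week, not by ID.
tidy = df.melt(id_vars="ID", var_name="Week", value_name="Value")

# A stable sort on ID regroups the rows per ID while preserving the
# original week order within each ID. Note that IDs end up in
# ascending order (101 first), unlike the stack-based result.
tidy = tidy.sort_values("ID", kind="mergesort").reset_index(drop=True)

print(tidy)
```

The result can be written out with tidy.to_excel('newfile.xlsx', index=False) in the same way as above.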


 