
Pivot/Unstack a Pandas DataFrame in Python

I have the following dataframe:

                            01/01/2017             02/01/2017
 Productid   ProductName    Sales     Discount     Sales     Discount
 1           abc            100       12           234       23
 2           xyz            156       13           237       13
 3           pqr            300       12           198       18

I need to convert this into the following dataframe.

 Productid   ProductName    Date          Sales      Discount
 1           abc            01/01/2017    100        12
 1           abc            02/01/2017    234        23
 2           xyz            01/01/2017    156        13
 2           xyz            02/01/2017    237        13
 3           pqr            01/01/2017    300        12
 3           pqr            02/01/2017    198        18

How can I do this in Python?

A MultiIndex is difficult to reproduce directly from pasted text, so I first rebuild the dataframe to match the OP's original.

import pandas as pd

df = pd.read_clipboard()  # read the body of the OP's dataframe (without the date header row)
df
    Productid   ProductName Sales   Discount    Sales.1 Discount.1
0           1           abc   100         12        234         23
1           2           xyz   156         13        237         13
2           3           pqr   300         12        198         18

df.columns = ['Productid', 'ProductName', 'Sales', 'Discount', 'Sales', 'Discount']
df.set_index(keys=['Productid','ProductName'],inplace=True)
df
                         Sales  Discount    Sales   Discount
Productid   ProductName             
        1           abc    100        12      234         23
        2           xyz    156        13      237         13
        3           pqr    300        12      198         18

array = [['01/01/2017','01/01/2017','02/01/2017','02/01/2017'],
         ['Sales', 'Discount', 'Sales',  'Discount']]
df.columns = pd.MultiIndex.from_arrays(array) #setting multi-index

Assuming this is the OP's Dataframe:

df
                         01/01/2017         02/01/2017
                         Sales  Discount    Sales   Discount
Productid   ProductName             
        1           abc    100        12      234         23
        2           xyz    156        13      237         13
        3           pqr    300        12      198         18
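
The clipboard-based setup only works if the right text happens to be copied. As a rough alternative, the same frame can probably be built entirely in code. This is a minimal sketch, not the answerer's method; the variable name wide and the use of MultiIndex.from_product and MultiIndex.from_tuples are my own choices for illustration.

```python
import pandas as pd

# Column labels as (date, measure) pairs, matching the layout in the question.
columns = pd.MultiIndex.from_product(
    [["01/01/2017", "02/01/2017"], ["Sales", "Discount"]]
)

# Row labels as (Productid, ProductName) pairs.
index = pd.MultiIndex.from_tuples(
    [(1, "abc"), (2, "xyz"), (3, "pqr")],
    names=["Productid", "ProductName"],
)

# One row per product: Sales, Discount for the first date, then the second.
data = [
    [100, 12, 234, 23],
    [156, 13, 237, 13],
    [300, 12, 198, 18],
]

wide = pd.DataFrame(data, index=index, columns=columns)
print(wide)
```

Because nothing here depends on the clipboard, the snippet can be pasted into a fresh session to get a frame with the same two-level column header.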

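Solution: call stack(level=0) to move the date level of the columns into the row index, then reset_index(level=[0,1]) to turn Productid and ProductName back into columns, and reset_index() once more to turn the remaining (unnamed) date level into a column called index. Finally, rename that column to Date with rename and select the columns in the desired order: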

df = df.stack(level=0).reset_index(level=[0,1]).reset_index()
df.rename(columns={'index':'Date'},inplace=True)
df[['Productid', 'ProductName','Date','Sales','Discount']]

    Productid   ProductName       Date  Sales   Discount
0           1           abc 01/01/2017    100         12
1           1           abc 02/01/2017    234         23
2           2           xyz 01/01/2017    156         13
3           2           xyz 02/01/2017    237         13
4           3           pqr 01/01/2017    300         12
5           3           pqr 02/01/2017    198         18
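
A possible variation, assuming you control how the frame is built: if the date level of the columns is given the name Date up front, a single reset_index() should produce a Date column and the rename step is no longer needed. The sketch below is self-contained and rebuilds the frame in code; the names wide and long_df are hypothetical, and the final sort is only there so that the row order does not depend on the pandas version (newer versions may also emit a FutureWarning about the stack implementation).

```python
import pandas as pd

# Name the outer column level "Date"; the inner level (Sales/Discount) stays unnamed.
columns = pd.MultiIndex.from_product(
    [["01/01/2017", "02/01/2017"], ["Sales", "Discount"]],
    names=["Date", None],
)
index = pd.MultiIndex.from_tuples(
    [(1, "abc"), (2, "xyz"), (3, "pqr")],
    names=["Productid", "ProductName"],
)
wide = pd.DataFrame(
    [[100, 12, 234, 23], [156, 13, 237, 13], [300, 12, 198, 18]],
    index=index,
    columns=columns,
)

wanted = ["Productid", "ProductName", "Date", "Sales", "Discount"]

long_df = (
    wide.stack(level="Date")      # move the Date level from columns to rows
        .reset_index()            # all three named index levels become columns
        .loc[:, wanted]           # fix the column order explicitly
        .sort_values(["Productid", "Date"])
        .reset_index(drop=True)
)
print(long_df)
```

Selecting the columns by name avoids relying on whether stack returns Sales before Discount or the other way round.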


 