
Create a panel in pandas from a dataframe with several date/price columns

I have a pandas DataFrame with columns asset1_date, asset1_price, asset2_date, asset2_price, etc. (up to about 500 assets). The dates in asset1_date and asset2_date are not necessarily the same. I want to reshape it into a panel (long format) with one column called asset, one column for date, and one column for price, i.e.:

pd.DataFrame({'asset':['asset1','asset1','asset2','asset2','asset2'],'date':['09/26/2003','09/29/2003','04/10/2007','04/11/2007','04/12/2007'],'price':[102,103,75,74,76]})

Currently, the data looks like:

import numpy as np
import pandas as pd
df = pd.DataFrame({'asset1_date':['09/26/2003','09/29/2003',np.nan],'asset1_price':[102,103,np.nan],'asset2_date':['04/10/2007','04/11/2007','04/12/2007'],'asset2_price':[75,74,76]})

Could anyone suggest a pandas method to achieve this? Thanks!

This should do the trick:

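# Stack into long form: one row per (original row, original column name).
df = df.stack().reset_index()
# Split names such as "asset1_date" into the asset ("asset1") and the field ("date").
df["asset"] = df["level_1"].str.split("_").str[0]
df["col"] = df["level_1"].str.split("_").str[1]
# Pivot the field back into columns, then drop the helper index level and the leftover name column.
df = (df.set_index(["level_0", "col", "asset"])
        .unstack("col")
        .reset_index("level_0", drop=True)
        .reset_index("asset", drop=False)
        .drop("level_1", axis=1, level=0))
# Note: assigning the column names directly is a brute-force approach that assumes you want exactly these three columns. For an alternative, see:
# https://stackoverflow.com/a/47979382/11610186
df.columns = ["asset", "date", "price"]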

Output:

    asset        date price
0  asset1  09/26/2003   102
1  asset2  04/10/2007    75
2  asset1  09/29/2003   103
3  asset2  04/11/2007    74
4  asset2  04/12/2007    76
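
If you would rather have the rows grouped by asset, as in the desired output in the question, a different route is to slice out each asset's date/price pair and concatenate the pieces. This is only a sketch, not the method from the answer above: it assumes every column is named `<asset>_date` or `<asset>_price`, and the names `wide`, `assets`, `pieces` and `panel` are made up for illustration. The final `pd.to_datetime` call is optional and assumes the dates are always MM/DD/YYYY strings.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
wide = pd.DataFrame({
    "asset1_date": ["09/26/2003", "09/29/2003", np.nan],
    "asset1_price": [102, 103, np.nan],
    "asset2_date": ["04/10/2007", "04/11/2007", "04/12/2007"],
    "asset2_price": [75, 74, 76],
})

# Asset prefixes in order of first appearance (assumes "<asset>_<field>" names).
assets = list(dict.fromkeys(c.rsplit("_", 1)[0] for c in wide.columns))

pieces = []
for asset in assets:
    part = (
        wide[[f"{asset}_date", f"{asset}_price"]]
        .dropna(how="all")                     # drop the NaN padding rows
        .set_axis(["date", "price"], axis=1)   # common column names
        .assign(asset=asset)                   # tag every row with its asset
    )
    pieces.append(part[["asset", "date", "price"]])

panel = pd.concat(pieces, ignore_index=True)

# Optional: parse the date strings (assumes MM/DD/YYYY throughout).
panel["date"] = pd.to_datetime(panel["date"], format="%m/%d/%Y")
```

Here the loop runs once per asset (about 500 iterations in your case), and each iteration is a cheap column selection. The price column comes out as float because asset1_price contains NaN in the wide frame.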
