
pandas reshape dataframe with timeseries index

The purpose of this script is to read a CSV file that looks like this:

Unnamed: 0         Release Date                               Event actual
0           0  2021-02-26 13:30:00                   Canada RMPI (MoM)   5.7%
1           1  2021-01-03 06:30:00  Canada Investing.com USD/CAD Index  37.8%
2           2  2021-01-03 13:30:00              Canada Current Account  -7.3B

What I want is this layout instead:

Release Date         Canada RMPI (MoM)  Canada Investing.com USD/CAD Index  Canada Current Account
2021-02-26 13:30:00  5.7%
2021-01-03 06:30:00                     37.8%
2021-01-03 13:30:00                                                         -7.3B

Events that were released at the same time should be stored in the same row.

So I tried this code:

import pandas as pd
df = pd.read_csv('df.csv')
df = pd.melt(df, id_vars=["Release Date"], var_name='event', value_name='actual')
print(df)

But this is what I got:

              Release Date       event                              actual
0  2021-02-26 13:30:00  Unnamed: 0                                   0
1  2021-01-03 06:30:00  Unnamed: 0                                   1
2  2021-01-03 13:30:00  Unnamed: 0                                   2
3  2021-02-26 13:30:00       Event                   Canada RMPI (MoM)
4  2021-01-03 06:30:00       Event  Canada Investing.com USD/CAD Index
5  2021-01-03 13:30:00       Event              Canada Current Account
6  2021-02-26 13:30:00      actual                                5.7%
7  2021-01-03 06:30:00      actual                               37.8%
8  2021-01-03 13:30:00      actual                               -7.3B

It ran without raising any error.

If you want to use a column as the index of your DataFrame, try df.set_index('column_name'). Running your .melt code afterwards should yield your desired results.

See examples here: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.set_index.html
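A minimal sketch of set_index on data shaped like the sample in the question. The CSV text is rebuilt inline as an assumption, since df.csv itself is not shown in full. It also parses the dates so that the index becomes a DatetimeIndex. Note that set_index only moves the column into the index; on its own it does not spread the events into separate columns.

```python
import io
import pandas as pd

# Inline stand-in for df.csv (assumed layout, copied from the question's sample).
csv_text = """,Release Date,Event,actual
0,2021-02-26 13:30:00,Canada RMPI (MoM),5.7%
1,2021-01-03 06:30:00,Canada Investing.com USD/CAD Index,37.8%
2,2021-01-03 13:30:00,Canada Current Account,-7.3B
"""

# index_col=0 absorbs the unnamed counter column, so no "Unnamed: 0" column appears;
# parse_dates turns the "Release Date" strings into real timestamps.
df = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=["Release Date"])

# Move "Release Date" from a regular column into the index.
indexed = df.set_index("Release Date")
print(indexed)
```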

Here is the answer:

import pandas as pd

df = pd.read_csv('df.csv')
df = df.pivot(index="Release Date", columns="Event", values="actual")
print(df)
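To check the "same time, same row" requirement, here is a small sketch of the same pivot call on hypothetical rows where two events share a release time. The rows and the 1.5% value are made up for illustration and are not from the original file.

```python
import pandas as pd

# Hypothetical rows: the first two events share a release time.
df = pd.DataFrame({
    "Release Date": pd.to_datetime([
        "2021-01-03 13:30:00",
        "2021-01-03 13:30:00",
        "2021-02-26 13:30:00",
    ]),
    "Event": ["Canada Current Account", "Canada RMPI (MoM)", "Canada RMPI (MoM)"],
    "actual": ["-7.3B", "1.5%", "5.7%"],
})

# One row per release time, one column per event; missing combinations become NaN.
wide = df.pivot(index="Release Date", columns="Event", values="actual")
print(wide)
```

If the same event can appear twice at the same timestamp, pivot raises a ValueError. One option in that case is df.pivot_table(index="Release Date", columns="Event", values="actual", aggfunc="first"), which keeps the first value per cell.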

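You can also set both columns as the index and unstack the events. Select the actual column first so that Unnamed: 0 does not end up in the result:

df.set_index(['Release Date', 'Event'])['actual'].unstack('Event').fillna('')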
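A sketch of the set_index/unstack route on the question's sample rows, rebuilt inline. It selects only the actual column before unstacking, on the assumption that helper columns such as Unnamed: 0 should not appear in the result.

```python
import pandas as pd

# Sample rows from the question, without the unnamed counter column.
df = pd.DataFrame({
    "Release Date": ["2021-02-26 13:30:00", "2021-01-03 06:30:00", "2021-01-03 13:30:00"],
    "Event": ["Canada RMPI (MoM)", "Canada Investing.com USD/CAD Index", "Canada Current Account"],
    "actual": ["5.7%", "37.8%", "-7.3B"],
})

wide = (
    df.set_index(["Release Date", "Event"])["actual"]  # Series with a two-level index
      .unstack("Event")                                # the "Event" level becomes the columns
      .fillna("")                                      # blank out cells with no release
)
print(wide)
```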
