How do I calculate the percentage change of a time series of daily data?

I have a daily time series of index data and want to compute year-over-year percentage changes. If I use DataFrame.pct_change(periods=...), I have to specify the exact number of rows back to the same day last year, which does not work because the number of working days differs from year to year. Does anyone know how to get the change relative to the same day one year earlier?

The code may look like:

import pandas as pd

list=[]
list=[[7.71],[7.79],[6.80],[6.44],[6.46],[6.80]]
df = pd.DataFrame(list, columns=['index'], index=['2016-01-04','2016-01-05','2016-01-06','2017-01-04','2017-01-05','2017-01-06'])

and I want the output as follows:

2017-01-04  -16.45%
2017-01-05  -17.10%
2017-01-06    0.00%

First, some recommendations:

  1. Don't use list as a variable name, because it shadows the built-in list.
  2. Don't use index as a column name: in pandas, index refers to the row labels. It is also confusing because a column can normally be accessed as df.column_name, but df.index returns the DataFrame's index rather than the column.
  3. Make sure to use pandas.DatetimeIndex when creating a datetime index.

With these changes, the setup becomes:

l = [7.71, 7.79, 6.80, 6.44, 6.46, 6.80] 
index = pd.DatetimeIndex(['2016-01-04', '2016-01-05', '2016-01-06', 
                          '2017-01-04' ,'2017-01-05' ,'2017-01-06'])
df = pd.DataFrame(l, columns=['col'], index = index) 

Now, you can use pandas.DataFrame.pct_change with a pandas.DateOffset object:

do = pd.DateOffset(years = 1)
df.pct_change(freq = do).dropna().mul(100)

Output

                  col
2017-01-04 -16.472114
2017-01-05 -17.073171
2017-01-06   0.000000
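
To make the effect of the freq argument easier to see, here is a minimal sketch of the same calculation done by hand, assuming the data from the question stored in a Series named s (a name chosen here for illustration). It shifts the date labels forward by one calendar year and then divides, relying on pandas to align the two series by date. It is meant as an illustration of the idea, not as a description of pandas internals.

```python
import pandas as pd

values = [7.71, 7.79, 6.80, 6.44, 6.46, 6.80]
dates = pd.DatetimeIndex(['2016-01-04', '2016-01-05', '2016-01-06',
                          '2017-01-04', '2017-01-05', '2017-01-06'])
s = pd.Series(values, index=dates, name='col')

# Relabel every observation one calendar year later, so the value from
# 2016-01-04 now sits on 2017-01-04, and so on. The values are unchanged.
year_ago = s.shift(freq=pd.DateOffset(years=1))

# Division aligns on the date labels; dates without a counterpart
# one year earlier (or later) become NaN and are dropped.
yearly_change = (s / year_ago - 1).dropna().mul(100)
print(yearly_change)
```

Because the match is made on calendar dates rather than on row positions, the number of trading days in between does not matter.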

EDIT: building on the good answer from @Pablo C: with the DataFrame defined as in the question, the index holds plain strings, so we first need to convert it to a DatetimeIndex. Otherwise @Pablo C's answer raises NotImplementedError: Not supported for type Index.

import pandas as pd

list=[]
list=[[7.71],[7.79],[6.80],[6.44],[6.46],[6.80]]
df = pd.DataFrame(list, columns=['index'], index=['2016-01-04','2016-01-05','2016-01-06','2017-01-04','2017-01-05','2017-01-06'])

df.index = pd.to_datetime(df.index)

do = pd.DateOffset(years = 1)
df.pct_change(freq = do).dropna().mul(100)


#               index
# 2017-01-04    -16.472114
# 2017-01-05    -17.073171
# 2017-01-06    0.000000
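
One caveat: matching on the exact calendar date yields NaN whenever the date one year earlier was not a trading day. The following is a hedged sketch of one possible workaround, not part of the answers above. It uses made-up data with an extra observation on 2017-01-09 (a hypothetical value) and falls back to the most recent earlier observation, looking back at most 7 days. The 7-day tolerance is an arbitrary assumption.

```python
import pandas as pd

# Data from the question plus one hypothetical row (2017-01-09) whose
# counterpart, 2016-01-09, is missing from the series.
s = pd.Series(
    [7.71, 7.79, 6.80, 6.44, 6.46, 6.80, 6.90],
    index=pd.to_datetime(['2016-01-04', '2016-01-05', '2016-01-06',
                          '2017-01-04', '2017-01-05', '2017-01-06',
                          '2017-01-09']),
)

# The calendar date exactly one year before each observation.
lookup = s.index - pd.DateOffset(years=1)

# For each lookup date, take the last value on or before it, but give up
# (NaN) if the nearest earlier observation is more than 7 days away.
base = s.reindex(lookup, method='ffill', tolerance=pd.Timedelta(days=7))
base.index = s.index  # relabel with the current dates so the division aligns

change = (s / base - 1).mul(100).dropna()
print(change)
```

Here 2017-01-09 is compared with 2016-01-06, the last observation on or before 2016-01-09. Whether such a fallback is appropriate depends on how the index is meant to be compared.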
