My DataFrame has a two-level index: date and country. Let's call the data column 'd'.
What I want to do is divide the data in 'd' by the value of 'd' at some fixed date t, separately for each country. In other words, I want to rescale each country's series so that its value is 1 at date t.
I tried groupby:
df['d2'] = df['d']/df.groupby(level='country')['d'].loc['t']
which fails, of course, because a groupby object has no .loc attribute. What should I do here?
Edit: example of my dataframe
date        country
2020-04-01  US          93.872715
2020-07-01  US         100.957790
2020-10-01  US         102.083749
2021-01-01  US         103.649602
2021-04-01  US         105.350228
2020-07-01  IL         101.168879
2020-10-01  IL         103.576224
2021-01-01  IL         103.212359
2021-04-01  IL         107.240892
2021-07-01  IL                NaN
I want to scale the data by the value at date '2020-07-01', so that the US data should be 93.87/100.96, 1, 102.08/100.96, ... and the IL data should be 1, 103.58/101.17, 103.21/101.17, ... and so on. I hope that makes sense.
You can select the wanted date and column with loc, which gives one value per country, then map the country level of the index onto those values. Divide column 'd' by the result:
df['norm_d'] = df['d']/df.index.get_level_values('country').map(df.loc['2020-07-01','d'])
print(df)
                             d    norm_d
date       country
2020-04-01 US        93.872715  0.929821
2020-07-01 US       100.957790  1.000000
2020-10-01 US       102.083749  1.011153
2021-01-01 US       103.649602  1.026663
2021-04-01 US       105.350228  1.043508
2020-07-01 IL       101.168879  1.000000
2020-10-01 IL       103.576224  1.023795
2021-01-01 IL       103.212359  1.020199
2021-04-01 IL       107.240892  1.060019
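For anyone who wants to try this without the original data, here is a minimal self-contained sketch of the same idea. It rebuilds the sample frame from the question, assuming the dates are stored as plain strings (the question does not say whether they are strings or datetimes). The names rows, base and countries are only illustrative, and the base-date values are taken with xs instead of loc purely to keep the lookup unambiguous.

```python
import pandas as pd

# Sample data copied from the question; dates are kept as plain strings (an assumption).
rows = [
    ("2020-04-01", "US", 93.872715),
    ("2020-07-01", "US", 100.957790),
    ("2020-10-01", "US", 102.083749),
    ("2021-01-01", "US", 103.649602),
    ("2021-04-01", "US", 105.350228),
    ("2020-07-01", "IL", 101.168879),
    ("2020-10-01", "IL", 103.576224),
    ("2021-01-01", "IL", 103.212359),
    ("2021-04-01", "IL", 107.240892),
    ("2021-07-01", "IL", float("nan")),
]
df = pd.DataFrame(rows, columns=["date", "country", "d"]).set_index(["date", "country"])

# Value of 'd' on the base date: a Series indexed by country.
base = df["d"].xs("2020-07-01", level="date")

# Look up each row's country in `base`, giving one divisor per row.
countries = df.index.get_level_values("country")
divisor = countries.map(base).to_numpy()

# Rescale so that every country equals 1 on the base date.
df["norm_d"] = df["d"] / divisor
print(df)
```

Rows whose 'd' is NaN stay NaN, and a country with no observation on the base date would get NaN for all of its rows, since the lookup finds no divisor for it.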
To see what is happening, print the intermediate results:
print(df.loc['2020-07-01','d'])
# country
# US 100.957790
# IL 101.168879
# Name: d, dtype: float64
print(df.index.get_level_values('country').map(df.loc['2020-07-01','d']))
# Float64Index([ 100.95779, 100.95779, 100.95779, 100.95779, 100.95779,
# 101.168879, 101.168879, 101.168879, 101.168879],
# dtype='float64', name='country')
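A possible alternative, not the approach shown above, is to let pandas do the broadcasting itself with Series.div and its level argument, which matches the divisor's index against one level of the MultiIndex. The sketch below uses a shortened copy of the sample data with string dates (an assumption), and base is just an illustrative name.

```python
import pandas as pd

# Shortened copy of the question's data; dates kept as plain strings (an assumption).
df = pd.DataFrame(
    {
        "date": ["2020-04-01", "2020-07-01", "2020-10-01", "2020-07-01", "2020-10-01"],
        "country": ["US", "US", "US", "IL", "IL"],
        "d": [93.872715, 100.957790, 102.083749, 101.168879, 103.576224],
    }
).set_index(["date", "country"])

# One base value per country, taken on the reference date.
base = df["d"].xs("2020-07-01", level="date")

# Divide, matching `base` against the 'country' level of the MultiIndex.
# Assigning back to the frame aligns the result on the full (date, country) index.
df["norm_d"] = df["d"].div(base, level="country")
print(df)
```

Both versions should give the same numbers. This one avoids building the intermediate mapped index by hand, at the cost of relying on index alignment.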