How do I divide a column by its value at a particular date, when the dataframe is in a long format?

Question

My dataframe has a two-level index, date and country. Let's call the data column 'd'.

What I want to do is to divide the data in 'd' by the value of 'd' at some fixed date t. Basically I want to rescale the series such that its value is 1 at date t.

I tried groupby:

df['d2'] = df['d']/df.groupby(level='country')['d'].loc['t']

which fails of course because groupby doesn't have attribute .loc. What should I do here?

Edit: example of my dataframe

date        country
2020-04-01  US          93.872715
2020-07-01  US         100.957790
2020-10-01  US         102.083749
2021-01-01  US         103.649602
2021-04-01  US         105.350228
   
2020-07-01  IL         101.168879
2020-10-01  IL         103.576224
2021-01-01  IL         103.212359
2021-04-01  IL         107.240892
2021-07-01  IL                NaN

I want to scale the data by the value at date '2020-07-01', so that the US data should be 93.87/100.96, 1, 102.08/100.96, ... and for IL, 1, 103.58/101.17, 103.21/101.17, ... and so on. I hope that makes sense.

Answer 1

You can select the wanted date and column with loc, which gives one value per country, then map the country level of the index onto those values. Divide column d by the result:

df['norm_d'] = df['d']/df.index.get_level_values('country').map(df.loc['2020-07-01','d'])
print(df)
                             d    norm_d
date       country                      
2020-04-01 US        93.872715  0.929821
2020-07-01 US       100.957790  1.000000
2020-10-01 US       102.083749  1.011153
2021-01-01 US       103.649602  1.026663
2021-04-01 US       105.350228  1.043508
2020-07-01 IL       101.168879  1.000000
2020-10-01 IL       103.576224  1.023795
2021-01-01 IL       103.212359  1.020199
2021-04-01 IL       107.240892  1.060019

Here is what is happening at each step:

print(df.loc['2020-07-01','d'])
# country
# US    100.957790
# IL    101.168879
# Name: d, dtype: float64

print(df.index.get_level_values('country').map(df.loc['2020-07-01','d']))
# Float64Index([ 100.95779,  100.95779,  100.95779,  100.95779,  100.95779,
#               101.168879, 101.168879, 101.168879, 101.168879],
#              dtype='float64', name='country')
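
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample data from the question (leaving out the NaN row for IL, as in the output above) and applies the same mapping idea. The names base_date, base, countries and alt are my own, and I am assuming the date level holds real timestamps rather than strings. The last line shows what should be an equivalent alternative, Series.div with its level argument, which broadcasts the per-country base values across the country level.

```python
import pandas as pd

# Sample data from the question (the NaN row for IL is left out).
rows = [
    ("2020-04-01", "US", 93.872715),
    ("2020-07-01", "US", 100.957790),
    ("2020-10-01", "US", 102.083749),
    ("2021-01-01", "US", 103.649602),
    ("2021-04-01", "US", 105.350228),
    ("2020-07-01", "IL", 101.168879),
    ("2020-10-01", "IL", 103.576224),
    ("2021-01-01", "IL", 103.212359),
    ("2021-04-01", "IL", 107.240892),
]
df = pd.DataFrame(rows, columns=["date", "country", "d"])
df["date"] = pd.to_datetime(df["date"])  # assumption: dates are timestamps
df = df.set_index(["date", "country"])

base_date = pd.Timestamp("2020-07-01")

# One base value per country: a Series indexed by country.
base = df["d"].xs(base_date, level="date")

# Look up each row's base value through its country label, then divide.
countries = df.index.get_level_values("country")
df["norm_d"] = df["d"].to_numpy() / countries.map(base).to_numpy()

# Alternative: let pandas broadcast base across the country level.
alt = df["d"].div(base, level="country")

print(df)
```

With either version, a country that has no row at the base date has nothing to look up, so its rescaled values come out as NaN instead of raising an error.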
