
How can I multiply a cell value of a DataFrame based on two conditions?

I have this dataframe

import numpy as np
import pandas as pd

data = {'month': ['5','5','6', '7'], 'condition': ["yes","no","yes","yes"],'amount': [500,200, 500, 500]}

and three values:

inflation5 = 1.05
inflation6 = 1.08
inflation7 = 1.08

I need to multiply the cells of column 'amount' by inflation5 when 'month' is 5 and 'condition' is "yes", by inflation6 when 'month' is 6 and 'condition' is "yes", and likewise by inflation7 for month 7. However, the calculation for month 6 must build on the newly calculated value for month 5, and the calculation for month 7 on the newly calculated value for month 6. To explain this better: the value 500 is an estimate that needs to be updated with monthly, cumulative inflation. The expected output for column 'amount' is [525, 200, 567, 612.36].

Thanks

For this I would use np.where. It keeps the code easy to read and easy to extend, especially if you later want to replace the condition with a function.

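df = pd.DataFrame(data)
# Start from a neutral factor of 1 and overwrite it where both conditions hold
df['Inflation'] = np.where((df['month'] == '5') & (df['condition'] == 'yes'), inflation5, 1)
df['Inflation'] = np.where((df['month'] == '6') & (df['condition'] == 'yes'), inflation6, df['Inflation'])
df['Inflation'] = np.where((df['month'] == '7') & (df['condition'] == 'yes'), inflation7, df['Inflation'])
# Each rate is applied on its own here (no compounding across months)
df['Total_Amount'] = df['amount'] * df['Inflation']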
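
As a sketch of how this could scale to more months, the chained np.where calls can be collapsed into a single np.select call. The column names follow the answer above; the names is_yes, conditions and factors are illustrative only. Like the np.where version, this applies each month's rate on its own and does not compound across months.

```python
import numpy as np
import pandas as pd

data = {'month': ['5', '5', '6', '7'],
        'condition': ['yes', 'no', 'yes', 'yes'],
        'amount': [500, 200, 500, 500]}
df = pd.DataFrame(data)

is_yes = df['condition'] == 'yes'

# One boolean condition per month, paired with its inflation factor
conditions = [(df['month'] == '5') & is_yes,
              (df['month'] == '6') & is_yes,
              (df['month'] == '7') & is_yes]
factors = [1.05, 1.08, 1.08]

# Rows matching no condition keep a neutral factor of 1
df['Inflation'] = np.select(conditions, factors, default=1.0)
df['Total_Amount'] = df['amount'] * df['Inflation']
```

Adding a month then means appending one entry to each list rather than writing another np.where line.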

I would suggest a different approach for efficiency.

Store the inflation rates in a dictionary; you can then update the column in a single vectorized call:

inflations = {'5': 1.05, '6': 1.08}

mask = df['condition'].eq('yes')
df.loc[mask, 'amount'] *= df.loc[mask, 'month'].map(inflations)

NB. If some months might be missing from the dictionary, use df.loc[mask, 'month'].map(inflations).fillna(1) in place of df.loc[mask, 'month'].map(inflations), so that those rows are left unchanged instead of becoming NaN.
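
To illustrate the NB above, here is a minimal sketch using the question's data with month '7' deliberately left out of the dictionary. The names factors_raw and factors_safe are illustrative, and the amounts are written as floats on the assumption that this avoids dtype upcasting when the products are assigned back.

```python
import pandas as pd

df = pd.DataFrame({'month': ['5', '5', '6', '7'],
                   'condition': ['yes', 'no', 'yes', 'yes'],
                   'amount': [500.0, 200.0, 500.0, 500.0]})

inflations = {'5': 1.05, '6': 1.08}  # month '7' is intentionally missing

mask = df['condition'].eq('yes')

# Without fillna, the unmapped month '7' yields NaN
factors_raw = df.loc[mask, 'month'].map(inflations)

# With fillna(1), the unmapped month gets a neutral factor
factors_safe = factors_raw.fillna(1)

df.loc[mask, 'amount'] *= factors_safe
```

With factors_raw the month 7 row would end up as NaN; with factors_safe it keeps its original amount of 500.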

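Output (rows 2 and 3 do not match the data shown in the question; they appear to come from an earlier version of it):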

  month condition  amount
0     5       yes     525
1     5        no     200
2     6       yes    6480
3     7        no    1873

Updated question: cumulative inflation

You can build a Series from the rates and take its cumprod:

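inflations = {'5': 1.05, '6': 1.08, '7': 1.08}

mask = df['condition'].eq('yes')
# cumulative factor per month: 1.05, 1.05*1.08, 1.05*1.08*1.08 (dict must be in month order)
s = pd.Series(inflations).cumprod()
# compounded amounts are fractional, so make the column float before assigning
df['amount'] = df['amount'].astype(float)
df.loc[mask, 'amount'] *= df.loc[mask, 'month'].map(s).fillna(1)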

Output:

  month condition  amount
0     5       yes  525.00
1     5        no  200.00
2     6       yes  567.00
3     7       yes  612.36
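
As a sanity check, the cumprod result can be compared with compounding the base estimate one month at a time. This is only a sketch under two assumptions: every "yes" row starts from the same base estimate of 500, and the dictionary lists the months in chronological order. The names cumulative, running and step_by_step are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'month': ['5', '5', '6', '7'],
                   'condition': ['yes', 'no', 'yes', 'yes'],
                   'amount': [500, 200, 500, 500]})
df['amount'] = df['amount'].astype(float)  # results are fractional

inflations = {'5': 1.05, '6': 1.08, '7': 1.08}

# Cumulative factor up to and including each month
cumulative = pd.Series(inflations).cumprod()

mask = df['condition'].eq('yes')
df.loc[mask, 'amount'] *= df.loc[mask, 'month'].map(cumulative).fillna(1)

# Same arithmetic, compounded month by month from the base estimate
running = 500.0
step_by_step = []
for rate in inflations.values():
    running *= rate
    step_by_step.append(round(running, 2))

print(df['amount'].round(2).tolist())
print(step_by_step)
```

The "yes" rows of the DataFrame and the step-by-step list should agree, since multiplying by the cumulative product is the same as applying each monthly rate in turn.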

