Let's say I have a dataframe of banking information. I have the present value (the current balance) and a list of transactions, and I want to work backwards to calculate the balance over time.
Here is the dataframe:
value CN_running_balance
2020-08-07 -50.82 843.70
2020-08-06 893.77 NaN
2020-08-05 0.00 NaN
2020-08-04 -9.56 NaN
2020-08-03 -12.21 NaN
... ... ...
2020-05-14 1224.78 NaN
2020-05-13 0.00 NaN
2020-05-12 0.00 NaN
2020-05-11 -25.00 NaN
2020-05-10 -0.00 NaN
And I want to fill in the running balance so that each row's balance is the previous row's balance minus the previous row's value:
value CN_running_balance
2020-08-07 -50.82 843.70
2020-08-06 893.77 894.52
2020-08-05 0.00 0.75
2020-08-04 -9.56 etc
2020-08-03 -12.21 etc
... ... ...
2020-05-14 1224.78 etc
2020-05-13 0.00 etc
2020-05-12 0.00 etc
2020-05-11 -25.00 etc
2020-05-10 -0.00 etc
This has been pretty tricky for me so I would appreciate any suggestions on how to solve the problem!
I would suggest simply iterating over your dataframe, since each row's balance depends on the balance already computed for the row before it. Here is one approach:
from io import StringIO
import pandas as pd
# Create your data frame
input_string = """value CN_running_balance
2020-08-07 -50.82 843.70
2020-08-06 893.77 NaN
2020-08-05 0.00 NaN
2020-08-04 -9.56 NaN
2020-08-03 -12.21 NaN
2020-05-14 1224.78 NaN
2020-05-13 0.00 NaN
2020-05-12 0.00 NaN
2020-05-11 -25.00 NaN
2020-05-10 -0.00 NaN"""
data = StringIO(input_string)
df = pd.read_csv(data, sep=' ', engine='python')
# Move the dates into an 'index' column so rows get integer labels 0..n-1
df = df.reset_index()
# Fill each row's balance from the previous row: balance minus that row's value
for i in range(1, len(df)):
    df.loc[i, 'CN_running_balance'] = df.loc[i-1, 'CN_running_balance'] - df.loc[i-1, 'value']
# Restore the original date index
df = df.set_index('index')
print(df)
Output:
value CN_running_balance
index
2020-08-07 -50.82 843.70
2020-08-06 893.77 894.52
2020-08-05 0.00 0.75
2020-08-04 -9.56 0.75
2020-08-03 -12.21 10.31
2020-05-14 1224.78 22.52
2020-05-13 0.00 -1202.26
2020-05-12 0.00 -1202.26
2020-05-11 -25.00 -1202.26
2020-05-10 -0.00 -1177.26
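The loop above can also be written without explicit iteration, because each balance is just the starting balance minus the sum of all earlier values. The following is a minimal sketch of that idea, not part of the original answer: it assumes the first row holds the known balance, uses only the first five rows of the sample data, and rounds to two decimals purely to hide floating-point noise.

```python
import pandas as pd

# First five rows of the sample data; only the first balance is known.
df = pd.DataFrame(
    {
        "value": [-50.82, 893.77, 0.00, -9.56, -12.21],
        "CN_running_balance": [843.70] + [float("nan")] * 4,
    },
    index=["2020-08-07", "2020-08-06", "2020-08-05", "2020-08-04", "2020-08-03"],
)

# Known starting balance (assumed to be in the first row).
start = df["CN_running_balance"].iloc[0]

# Sum of the values of all *previous* rows: shift down by one, then accumulate.
previous_values = df["value"].shift(1, fill_value=0).cumsum()

# Each balance is the starting balance minus everything that came before it.
df["CN_running_balance"] = (start - previous_values).round(2)

print(df)
```

On a long transaction history this avoids the row-by-row `df.loc` assignments, which tend to be slow in pandas; for a few hundred rows either version should be fine.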
Step 1: Calculate the delta that each row applies to the next row's balance, which is the previous row's value negated (R syntax)
delta = -c(0, head(df$value, -1)) # delta[i] is -value[i-1]; the first row has no previous row
initial = df$CN_running_balance[1] # known balance in the first row
df$CN_running_balance[1] = 0 # accumulate changes relative to the first row
for (i in 2:nrow(df)) { # for every other row
  df$CN_running_balance[i] = df$CN_running_balance[i-1] + delta[i] # previous balance plus this row's delta
}
Step 2: Adjust the CN running balance column by adding the initial balance back to every row
df$CN_running_balance = df$CN_running_balance + initial # shift all rows so the first row equals the known balance