"Dynamic" Function to Calculate Multiple Columns Simultaneously in R
I have a problem and I'm completely stumped on where or how to start. I'm given a dataset which looks like the picture below.
In the picture, the column Current is the starting dollar balance. Paid is the monthly amount that was paid, and it is always a fixed amount ($10 on each row). %ofCurrentThatisExtra is the predicted percentage of Current that will be paid in addition to the required $10. ExtraPayment is Current multiplied by %ofCurrentThatisExtra. Outstanding is Current minus Paid minus ExtraPayment.
I'm given only the first row's starting Current (100). What I need to do is calculate ExtraPayment and Outstanding for every row. Again, if you look at the picture, each row's Current is equal to the lagged Outstanding, that is, the Outstanding from the previous row.
I need to write a script in R that does this. What has me stumped is how to write a function that first calculates ExtraPayment, then Outstanding, and then grabs the lagged Outstanding and recalculates these columns for the next row. It seems like I need two if_else statements calculating two different columns at the same time, and I'm not sure if that's possible in R. Does anyone have any ideas?
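To make the recurrence concrete, here is a minimal sketch that works the first two rows by hand. The starting balance of 100 and the fixed payment of 10 come from the question; the rates 0.02 and 0.05 are the first two values in the data used by the answers. The variable names are purely illustrative.

```r
# Row 1: the starting balance is given.
current_1     <- 100
paid          <- 10                      # fixed payment on every row
extra_1       <- current_1 * 0.02        # ExtraPayment = Current * %ofCurrentThatisExtra
outstanding_1 <- current_1 - paid - extra_1

# Row 2: Current is the previous row's Outstanding.
current_2     <- outstanding_1
extra_2       <- current_2 * 0.05
outstanding_2 <- current_2 - paid - extra_2

c(outstanding_1, outstanding_2)
```

The two balances come out to 88 and 73.6. Because each row depends on the result of the row before it, the columns cannot be filled in with a single vectorised expression; some form of row-by-row iteration is needed.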
Doing this with a for loop makes the most sense to me. First, here's the data:
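library(tibble)

dat <- tibble(
  current = c(100, rep(NA, 6)),   # only the first balance is known
  paid = 10,                      # fixed payment, recycled to every row
  pct_extra = c(.02, .05, .05, .07, .03, .01, .09),
  ExtraPayment = NA,              # filled in by the loop
  Outstanding = NA                # filled in by the loop
)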
dat
# # A tibble: 7 x 5
# current paid pct_extra ExtraPayment Outstanding
# <dbl> <dbl> <dbl> <lgl> <lgl>
# 1 100 10 0.02 NA NA
# 2 NA 10 0.05 NA NA
# 3 NA 10 0.05 NA NA
# 4 NA 10 0.07 NA NA
# 5 NA 10 0.03 NA NA
# 6 NA 10 0.01 NA NA
# 7 NA 10 0.09 NA NA
Now, the loop that does the work:
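for (i in seq_len(nrow(dat))) {
  # Extra payment is a percentage of the current balance
  dat$ExtraPayment[i] <- dat$current[i] * dat$pct_extra[i]
  # Balance left after the fixed payment and the extra payment
  dat$Outstanding[i] <- dat$current[i] - dat$paid[i] - dat$ExtraPayment[i]
  # Carry the outstanding balance forward as the next row's starting balance
  if (i < nrow(dat)) {
    dat$current[i + 1] <- dat$Outstanding[i]
  }
}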
dat
# # A tibble: 7 x 5
# current paid pct_extra ExtraPayment Outstanding
# <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 100 10 0.02 2 88
# 2 88 10 0.05 4.4 73.6
# 3 73.6 10 0.05 3.68 59.9
# 4 59.9 10 0.07 4.19 45.7
# 5 45.7 10 0.03 1.37 34.4
# 6 34.4 10 0.01 0.344 24.0
# 7 24.0 10 0.09 2.16 11.8
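As an alternative sketch rather than part of the original answer, the same running balance can be built in base R with Reduce() and accumulate = TRUE, which avoids the explicit index bookkeeping. The helper name step_balance and the object names balances and result are made up for illustration; the data values are the ones used above.

```r
# Hypothetical helper: apply one month's payments to a balance.
step_balance <- function(current, pct_extra, paid = 10) {
  current - paid - current * pct_extra
}

pct_extra <- c(.02, .05, .05, .07, .03, .01, .09)

# Fold over the rates, starting from 100. With accumulate = TRUE the result
# holds the initial balance followed by the balance after each row.
balances <- Reduce(step_balance, pct_extra, accumulate = TRUE, 100)

result <- data.frame(
  current   = head(balances, -1),   # balance at the start of each row
  paid      = 10,
  pct_extra = pct_extra
)
result$ExtraPayment <- result$current * result$pct_extra
result$Outstanding  <- balances[-1] # balance at the end of each row
result
```

This should reproduce the table from the loop. The for loop is arguably easier to read and to extend if more columns depend on each other, so which form to use is largely a matter of taste.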
You can also use datastep() from the libr package, which processes the data row by row much like a for loop, but is a little cleaner.
Data:
library(tibble)

dat <- tibble(
  current = c(100, rep(NA, 6)),
  paid = 10,
  pct_extra = c(.02, .05, .05, .07, .03, .01, .09)
)
Calculations:
library(libr)

dat2 <- datastep(dat,
  retain = list(Outstanding = 0),
  {
    # Assign current value
    if (n. > 1)
      current <- Outstanding

    # Calculate extra payment
    ExtraPayment <- current * pct_extra

    # Calculate outstanding balance
    Outstanding <- current - paid - ExtraPayment
  })
dat2
# # A tibble: 7 x 5
# current paid pct_extra Outstanding ExtraPayment
# <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 100 10 0.02 88 2
# 2 88 10 0.05 73.6 4.4
# 3 73.6 10 0.05 59.9 3.68
# 4 59.9 10 0.07 45.7 4.19
# 5 45.7 10 0.03 34.4 1.37
# 6 34.4 10 0.01 24.0 0.344
# 7 24.0 10 0.09 11.8 2.16