Calculating real dollar values based on nominal values in R
I have a dataset with a few variables:
Plan <- c("A","A","A","B","B","B","B")
Plan_Period <- c(1,2,3,1,2,3,4)
Plan_Elapsed_time <- c(0.5,1,0.25,1,0.5,0.3,0.25)
year <- c(2016,2017,2018,2015,2016,2017,2018)
Inflation <- c(1.014,1.012,1.012,1.013,1.012, 1.080,1.020)
Cost <- c(10,20,30,40,40,50,60)
data <- data.frame(Plan, Plan_Period, Plan_Elapsed_time, year, Inflation, Cost)
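The formulas for converting plan A's dollar values from nominal to real are as follows:
Real value for period 1: 10*(1.014^0.5)*(1.012^1)*(1.012^0.25)
Real value for period 2: 20*(1.012^1)*(1.012^0.25)
Real value for period 3: 30*(1.012^0.25)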
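The three cases appear to follow one pattern. As a sketch, writing c_i for Cost, r_i for Inflation and t_i for Plan_Elapsed_time in period i of a plan with n periods (these symbols are introduced here for illustration and are not in the original post), the real value of period i would be:

```latex
\[
\mathrm{Real}_i \;=\; c_i \prod_{j=i}^{n} r_j^{\,t_j}
\]
```

For plan A with i = 1 this gives 10*(1.014^0.5)*(1.012^1)*(1.012^0.25), matching the first formula above.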
I would like to do this on a dataset with more than 1000 different plans using other functions, rather than writing a for loop.
I appreciate your help!
Using base R:
data <- data.frame(Plan, Plan_Period, Plan_Elapsed_time, year, Inflation, Cost)
# Within each plan, multiply Cost by the product of the current and all later period factors
transform(data, m = Cost * ave(Inflation^Plan_Elapsed_time, Plan,
                               FUN = function(x) rev(cumprod(rev(x)))))
Plan Plan_Period Plan_Elapsed_time year Inflation Cost m
1 A 1 0.50 2016 1.014 10 10.22103
2 A 2 1.00 2017 1.012 20 20.30045
3 A 3 0.25 2018 1.012 30 30.08960
4 B 1 1.00 2015 1.013 40 41.92150
5 B 2 0.50 2016 1.012 40 41.38352
6 B 3 0.30 2017 1.080 50 51.42179
7 B 4 0.25 2018 1.020 60 60.29778
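The key step in that call is rev(cumprod(rev(x))), which is a cumulative product taken from the last element backwards. A minimal sketch using plan A's numbers from the question (the variable names rev_cumprod and by_hand are made up for this illustration):

```r
# Per-period factors for plan A: Inflation ^ Plan_Elapsed_time
x <- c(1.014^0.5, 1.012^1, 1.012^0.25)

# cumprod() runs front to back; reversing before and after
# makes it accumulate from the last period towards the first
rev_cumprod <- rev(cumprod(rev(x)))

# The same three products written out by hand
by_hand <- c(x[1] * x[2] * x[3], x[2] * x[3], x[3])

print(all.equal(rev_cumprod, by_hand))

# Multiply by plan A's costs to get the real values
print(c(10, 20, 30) * rev_cumprod)
```

Each element therefore holds the product of its own factor and every later factor in the same plan, which is what the formulas in the question ask for.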
We can also use tidyverse:
library(tidyverse)
data%>%
group_by(Plan)%>%
mutate(m=Cost*rev(cumprod(rev(Inflation^Plan_Elapsed_time))))
# A tibble: 7 x 7
# Groups: Plan [2]
Plan Plan_Period Plan_Elapsed_time year Inflation Cost m
<fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 A 1.00 0.500 2016 1.01 10.0 10.2
2 A 2.00 1.00 2017 1.01 20.0 20.3
3 A 3.00 0.250 2018 1.01 30.0 30.1
4 B 1.00 1.00 2015 1.01 40.0 41.9
5 B 2.00 0.500 2016 1.01 40.0 41.4
6 B 3.00 0.300 2017 1.08 50.0 51.4
7 B 4.00 0.250 2018 1.02 60.0 60.3
Or data.table:
library(data.table)
setDT(data)[,m:=(Cost*rev(cumprod(rev(Inflation^Plan_Elapsed_time)))),by=Plan][]
Plan Plan_Period Plan_Elapsed_time year Inflation Cost m
1: A 1 0.50 2016 1.014 10 10.22103
2: A 2 1.00 2017 1.012 20 20.30045
3: A 3 0.25 2018 1.012 30 30.08960
4: B 1 1.00 2015 1.013 40 41.92150
5: B 2 0.50 2016 1.012 40 41.38352
6: B 3 0.30 2017 1.080 50 51.42179
7: B 4 0.25 2018 1.020 60 60.29778
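All three variants above rely on the rows already being sorted by Plan_Period within each Plan, as they are in the sample data. If that is not guaranteed in the real dataset, one option is to sort first. A sketch in base R, where the shuffled row order is invented purely to illustrate the point:

```r
data <- data.frame(
  Plan = c("A", "A", "A", "B", "B", "B", "B"),
  Plan_Period = c(1, 2, 3, 1, 2, 3, 4),
  Plan_Elapsed_time = c(0.5, 1, 0.25, 1, 0.5, 0.3, 0.25),
  Inflation = c(1.014, 1.012, 1.012, 1.013, 1.012, 1.080, 1.020),
  Cost = c(10, 20, 30, 40, 40, 50, 60),
  stringsAsFactors = FALSE
)

# Hypothetical unsorted input: same rows, arbitrary order
shuffled <- data[c(3, 1, 2, 7, 5, 4, 6), ]

# Sort by plan and period before taking the reverse cumulative product
sorted <- shuffled[order(shuffled$Plan, shuffled$Plan_Period), ]
sorted$m <- sorted$Cost * ave(sorted$Inflation^sorted$Plan_Elapsed_time,
                              sorted$Plan,
                              FUN = function(x) rev(cumprod(rev(x))))
print(sorted)
```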
No loop is needed. With the data.table package you can compute the cumulative inflation by group and then multiply the result by the cost:
data <- data.frame(
Plan = c("A","A","A","B","B","B","B"),
Plan_Period=c(1,2,3,1,2,3,4),
Plan_Elapsed_time=c(0.5,1,0.25,1,0.5,0.3,0.25),
year=c(2016,2017,2018,2015,2016,2017,2018),
Inflation= c(1.014,1.012,1.012,1.013,1.012, 1.080,1.020),
Cost= c(10,20,30,40,40,50,60)
)
library(data.table)
setDT(data)
data <- data[order(Plan, -Plan_Period)][, Cum_Inflation := cumprod(Inflation^Plan_Elapsed_time), by = Plan][, Real_Cost := Cost * Cum_Inflation]
print(data)
#> Plan Plan_Period Plan_Elapsed_time year Inflation Cost Cum_Inflation Real_Cost
#> 1: A 3 0.25 2018 1.012 30 1.002987 30.08960
#> 2: A 2 1.00 2017 1.012 20 1.015022 20.30045
#> 3: A 1 0.50 2016 1.014 10 1.022103 10.22103
#> 4: B 4 0.25 2018 1.020 60 1.004963 60.29778
#> 5: B 3 0.30 2017 1.080 50 1.028436 51.42179
#> 6: B 2 0.50 2016 1.012 40 1.034588 41.38352
#> 7: B 1 1.00 2015 1.013 40 1.048038 41.92150
An optimized version based on @Sathish's comment:
data <- data.frame(
Plan = c("A","A","A","B","B","B","B"),
Plan_Period=c(1,2,3,1,2,3,4),
Plan_Elapsed_time=c(0.5,1,0.25,1,0.5,0.3,0.25),
year=c(2016,2017,2018,2015,2016,2017,2018),
Inflation= c(1.014,1.012,1.012,1.013,1.012, 1.080,1.020),
Cost= c(10,20,30,40,40,50,60)
)
library(data.table)
setDT(data)[order(Plan, -Plan_Period), real_val := Cost * cumprod( Inflation ^ Plan_Elapsed_time ), by = .(Plan)]
data
#> Plan Plan_Period Plan_Elapsed_time year Inflation Cost real_val
#> 1: A 1 0.50 2016 1.014 10 10.22103
#> 2: A 2 1.00 2017 1.012 20 20.30045
#> 3: A 3 0.25 2018 1.012 30 30.08960
#> 4: B 1 1.00 2015 1.013 40 41.92150
#> 5: B 2 0.50 2016 1.012 40 41.38352
#> 6: B 3 0.30 2017 1.080 50 51.42179
#> 7: B 4 0.25 2018 1.020 60 60.29778
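Here is a tidyverse solution. I think I understood the logic as explained: a larger plan period is more recent, so it has fewer inflation ^ plan_elapsed_time factors to be multiplied by. Here we arrange the rows by plan and then by descending plan_period, then use cumprod to build the correct factor to multiply cost by.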
library(tidyverse)
data <- tibble(
Plan = c("A","A","A","B","B","B","B"),
Plan_Period = c(1,2,3,1,2,3,4),
Plan_Elapsed_time = c(0.5,1,0.25,1,0.5,0.3,0.25),
year = c(2016,2017,2018,2015,2016,2017,2018),
Inflation = c(1.014,1.012,1.012,1.013,1.012, 1.080,1.020),
Cost = c(10,20,30,40,40,50,60)
)
data %>%
`colnames<-`(str_to_lower(colnames(.))) %>%
mutate(deflate = inflation ^ plan_elapsed_time) %>%
group_by(plan) %>%
arrange(plan, desc(plan_period)) %>%
mutate(
cum_deflate = cumprod(deflate),
real_cost = cost * cum_deflate
) %>%
select(plan:cost, real_cost)
#> # A tibble: 7 x 7
#> # Groups: plan [2]
#> plan plan_period plan_elapsed_time year inflation cost real_cost
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 A 3. 0.250 2018. 1.01 30. 30.1
#> 2 A 2. 1.00 2017. 1.01 20. 20.3
#> 3 A 1. 0.500 2016. 1.01 10. 10.2
#> 4 B 4. 0.250 2018. 1.02 60. 60.3
#> 5 B 3. 0.300 2017. 1.08 50. 51.4
#> 6 B 2. 0.500 2016. 1.01 40. 41.4
#> 7 B 1. 1.00 2015. 1.01 40. 41.9
Created on 2018-04-09 by the reprex package (v0.2.0).
Consider by with an inner sapply call to compute the conditional product:
by_list <- by(data, data$Plan, function(sub){
sub$RealValue <- sapply(sub$Plan_Period, function(i)
sub$Cost[sub$Plan_Period == i] * prod((sub$Inflation[sub$Plan_Period >= i])^(sub$Plan_Elapsed_time[sub$Plan_Period >= i]))
)
return(sub)
})
finaldata <- do.call(rbind, unname(by_list))
finaldata
# Plan Plan_Period Plan_Elapsed_time year Inflation Cost RealValue
# 1 A 1 0.50 2016 1.014 10 10.22103
# 2 A 2 1.00 2017 1.012 20 20.30045
# 3 A 3 0.25 2018 1.012 30 30.08960
# 4 B 1 1.00 2015 1.013 40 41.92150
# 5 B 2 0.50 2016 1.012 40 41.38352
# 6 B 3 0.30 2017 1.080 50 51.42179
# 7 B 4 0.25 2018 1.020 60 60.29778
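The sapply call above recomputes a product for every period. As a possible alternative sketch, base R's Reduce with accumulate = TRUE and right = TRUE returns the running product from the last element backwards in a single pass. The helper name rev_cumprod is made up for this example, and the rows are assumed to be sorted by Plan_Period within each Plan:

```r
data <- data.frame(
  Plan = c("A", "A", "A", "B", "B", "B", "B"),
  Plan_Period = c(1, 2, 3, 1, 2, 3, 4),
  Plan_Elapsed_time = c(0.5, 1, 0.25, 1, 0.5, 0.3, 0.25),
  Inflation = c(1.014, 1.012, 1.012, 1.013, 1.012, 1.080, 1.020),
  Cost = c(10, 20, 30, 40, 40, 50, 60),
  stringsAsFactors = FALSE
)

# Hypothetical helper: running product accumulated from the right
rev_cumprod <- function(x) Reduce(`*`, x, accumulate = TRUE, right = TRUE)

# Apply it per plan with ave(), then scale by the nominal cost
data$RealValue <- data$Cost *
  ave(data$Inflation^data$Plan_Elapsed_time, data$Plan, FUN = rev_cumprod)
print(data)
```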
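I'm not sure whether I misunderstood the question, but writing a for loop for this data seems fairly simple.
n <- length(Plan)
for (i in 1:n) {
  if (Plan[i] == "A" && Plan_Period[i] == 1) {
    # Period 1 is inflated by the factors of periods 1, 2 and 3
    print(10 * (1.014^0.5) * (1.012^1) * (1.012^0.25))
  } else if (Plan[i] == "A" && Plan_Period[i] == 2) {
    # Period 2 is inflated by the factors of periods 2 and 3
    print(20 * (1.012^1) * (1.012^0.25))
  } else if (Plan[i] == "A" && Plan_Period[i] == 3) {
    # Period 3 is inflated by its own factor only
    print(30 * (1.012^0.25))
  } else {
    # Rows from other plans are not handled here
    print(0)
  }
}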
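The loop above hard-codes plan A's numbers, so it would not scale to the 1000+ plans mentioned in the question. If a loop is preferred anyway, a more general sketch might look like the following; the column name real_val and the loop variables are chosen here for illustration only:

```r
data <- data.frame(
  Plan = c("A", "A", "A", "B", "B", "B", "B"),
  Plan_Period = c(1, 2, 3, 1, 2, 3, 4),
  Plan_Elapsed_time = c(0.5, 1, 0.25, 1, 0.5, 0.3, 0.25),
  Inflation = c(1.014, 1.012, 1.012, 1.013, 1.012, 1.080, 1.020),
  Cost = c(10, 20, 30, 40, 40, 50, 60),
  stringsAsFactors = FALSE
)

data$real_val <- NA_real_
for (p in unique(data$Plan)) {
  # Row indices of this plan, ordered by period
  idx <- which(data$Plan == p)
  idx <- idx[order(data$Plan_Period[idx])]
  factors <- data$Inflation[idx]^data$Plan_Elapsed_time[idx]
  for (k in seq_along(idx)) {
    # Cost times the factors of the current and all later periods
    data$real_val[idx[k]] <- data$Cost[idx[k]] * prod(factors[k:length(idx)])
  }
}
print(data)
```

This reads the factors from the data instead of typing them in, but the vectorised answers above are likely to be faster on a large dataset.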
Hope this helps!