How to add a column to a dataset which adds values from one column and subtracts values from another column in R?
Here is a small example of the kind of data I have:
library(tibble)

transactions <- tibble(id = 1:7,
                       day = paste0("day", 1:7),
                       sent_to = c(NA, "Garden Cinema", "Pasta House", NA, "Blue Superstore", "Jane", "Joe"),
                       received_from = c("ATM", NA, NA, "Sarah", NA, NA, NA),
                       reference = c("add_cash", "cinema_tickets", "meal", "gift", "shopping", "reimbursed", "reimbursed"),
                       decrease = c(NA, 10.8, 12.5, NA, 15.25, NA, NA),
                       increase = c(50, NA, NA, 30, NA, 5.40, 7.25))
# # A tibble: 7 × 7
# id day sent_to received_from reference decrease increase
# <int> <chr> <chr> <chr> <chr> <dbl> <dbl>
# 1 1 day1 NA ATM add_cash NA 50
# 2 2 day2 Garden Cinema NA cinema_tickets 10.8 NA
# 3 3 day3 Pasta House NA meal 12.5 NA
# 4 4 day4 NA Sarah gift NA 30
# 5 5 day5 Blue Superstore NA shopping 15.2 NA
# 6 6 day6 Jane NA reimbursed NA 5.4
# 7 7 day7 Joe NA reimbursed NA 7.25
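I would like to add a "balance" column to this dataset, where each value in increase is added to the running balance and each value in decrease is subtracted from it.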
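I have been struggling to do this on my own, because I don't know whether any existing function handles this kind of data. The only function that comes to mind is dplyr::lag(), but I'm not sure how to use it.
Any help is appreciated :)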
You can first create a change column, then use purrr::accumulate to create your balance column:
library(dplyr, warn.conflicts = FALSE)
library(purrr)

transactions |>
  mutate(change = coalesce(increase, -decrease),
         balance = accumulate(change, ~ .x + .y))
#> # A tibble: 7 × 9
#> id day sent_to received_…¹ refer…² decre…³ incre…⁴ change balance
#> <int> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 1 day1 <NA> ATM add_ca… NA 50 50 50
#> 2 2 day2 Garden Cinema <NA> cinema… 10.8 NA -10.8 39.2
#> 3 3 day3 Pasta House <NA> meal 12.5 NA -12.5 26.7
#> 4 4 day4 <NA> Sarah gift NA 30 30 56.7
#> 5 5 day5 Blue Superstore <NA> shoppi… 15.2 NA -15.2 41.4
#> 6 6 day6 Jane <NA> reimbu… NA 5.4 5.4 46.8
#> 7 7 day7 Joe <NA> reimbu… NA 7.25 7.25 54.1
#> # … with abbreviated variable names ¹received_from, ²reference, ³decrease,
#> # ⁴increase
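For what it's worth, accumulating with `+` is just a running sum, so the same idea can be sketched in base R without any packages. The snippet below is only an illustration: the two vectors are copied from the question's increase and decrease columns, and `ifelse()` stands in for `coalesce()`.

```r
# Amounts copied from the question's `increase` and `decrease` columns.
increase <- c(50, NA, NA, 30, NA, 5.40, 7.25)
decrease <- c(NA, 10.8, 12.5, NA, 15.25, NA, NA)

# Signed change per row: the increase if present, otherwise minus the decrease.
change <- ifelse(is.na(increase), -decrease, increase)

# Running total, keeping every intermediate result (base-R analogue of accumulate).
balance <- Reduce(`+`, change, accumulate = TRUE)

# cumsum() computes the same running total directly.
same_as_cumsum <- isTRUE(all.equal(balance, cumsum(change)))
print(same_as_cumsum)
```

Since both routes give the same running total here, the choice between accumulate and cumsum is mostly a matter of taste for plain addition.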
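You can first create a change column that holds negative values for decreases and positive values for increases. You can then use the cumsum function to build a running total for the balance column.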
transactions <- transactions %>%
  mutate(
    change = case_when(
      !is.na(decrease) ~ -1 * decrease,  # make the value negative if it is a decrease
      !is.na(increase) ~ increase),
    balance = cumsum(change))
Output:
> transactions
# A tibble: 7 × 9
id day sent_to received_from reference decrease increase change balance
<int> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
1 1 day1 NA ATM add_cash NA 50 50 50
2 2 day2 Garden Cinema NA cinema_tickets 10.8 NA -10.8 39.2
3 3 day3 Pasta House NA meal 12.5 NA -12.5 26.7
4 4 day4 NA Sarah gift NA 30 30 56.7
5 5 day5 Blue Superstore NA shopping 15.2 NA -15.2 41.4
6 6 day6 Jane NA reimbursed NA 5.4 5.4 46.8
7 7 day7 Joe NA reimbursed NA 7.25 7.25 54.1
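One caveat with this approach: cumsum propagates NA, so a row with neither an increase nor a decrease would make every later balance NA. No such row exists in the question's data. The base-R sketch below therefore uses hypothetical values, and assumes that a missing amount should simply count as zero change. The helper `na_to_zero` is made up for this illustration.

```r
# Hypothetical data: row 3 has neither an increase nor a decrease.
increase <- c(50, NA, NA, 30)
decrease <- c(NA, 10.8, NA, NA)

# Illustrative helper: treat a missing amount as "no change".
na_to_zero <- function(x) ifelse(is.na(x), 0, x)

# Signed change per row; a row with both values missing becomes 0 instead of NA.
change <- na_to_zero(increase) - na_to_zero(decrease)

# Running balance; row 3 carries the previous balance forward.
balance <- cumsum(change)
print(balance)
```

Inside a dplyr pipeline, the same effect should be available with `coalesce(increase, 0) - coalesce(decrease, 0)` in place of the case_when call.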