简体   繁体   中英

How to add a column to a dataset which adds values from one column and subtracts values from another column in R?

Here is a small example of the kind of data I have:

transactions <- tibble(id = seq(1:7),
                       day = paste(rep("day", each = 7), seq(1:7), sep = ""),
                       sent_to = c(NA, "Garden Cinema", "Pasta House", NA, "Blue Superstore", "Jane", "Joe"),
                       received_from = c("ATM", NA, NA, "Sarah", NA, NA, NA),
                       reference = c("add_cash", "cinema_tickets", "meal", "gift", "shopping", "reimbursed", "reimbursed"),
                       decrease = c(NA, 10.8, 12.5, NA, 15.25, NA, NA),
                       increase = c(50, NA, NA, 30, NA, 5.40, 7.25))

# # A tibble: 7 × 7
#      id   day   sent_to         received_from reference      decrease  increase
#    <int>  <chr> <chr>           <chr>         <chr>          <dbl>     <dbl>   
# 1     1   day1  NA              ATM           add_cash       NA        50      
# 2     2   day2  Garden Cinema   NA            cinema_tickets 10.8      NA      
# 3     3   day3  Pasta House     NA            meal           12.5      NA      
# 4     4   day4  NA              Sarah         gift           NA        30      
# 5     5   day5  Blue Superstore NA            shopping       15.2      NA      
# 6     6   day6  Jane            NA            reimbursed     NA        5.4     
# 7     7   day7  Joe             NA            reimbursed     NA        7.25    

I would like to add a "balance" column to this dataset where:

  • Row 1: starts with 50
  • Row 2: has previous balance amount + increase - decrease
  • Row 3, etc.: same as row 2 formula

I've been struggling to do this myself as I don't know if there are any existing functions which help with this types of data manipulation. The only function that comes to mind is the dplyr::lag() but I'm not sure how to use it.

Any help is appreciated:)

You could first create a column of the change and second use purrr::accumulate to create your balance column:

library(dplyr, warn = FALSE)
library(purrr)

transactions |> 
  mutate(change = coalesce(increase, -decrease),
         balance = accumulate(change, ~ .x + .y))
#> # A tibble: 7 × 9
#>      id day   sent_to         received_…¹ refer…² decre…³ incre…⁴ change balance
#>   <int> <chr> <chr>           <chr>       <chr>     <dbl>   <dbl>  <dbl>   <dbl>
#> 1     1 day1  <NA>            ATM         add_ca…    NA     50     50       50  
#> 2     2 day2  Garden Cinema   <NA>        cinema…    10.8   NA    -10.8     39.2
#> 3     3 day3  Pasta House     <NA>        meal       12.5   NA    -12.5     26.7
#> 4     4 day4  <NA>            Sarah       gift       NA     30     30       56.7
#> 5     5 day5  Blue Superstore <NA>        shoppi…    15.2   NA    -15.2     41.4
#> 6     6 day6  Jane            <NA>        reimbu…    NA      5.4    5.4     46.8
#> 7     7 day7  Joe             <NA>        reimbu…    NA      7.25   7.25    54.1
#> # … with abbreviated variable names ¹​received_from, ²​reference, ³​decrease,
#> #   ⁴​increase

You could first make a column change that has negative values in case of a decrease and positive values in case of an increase. In turn, you could use the cumsum function to create a cumulative total for the balance column.

transactions <- transactions %>%
  mutate(
    change = case_when( 
      !is.na(decrease) ~ -1*decrease, #make  values negative if decrease 
      !is.na(increase) ~ increase),
    balance = cumsum(change))

Output:

> transactions
# A tibble: 7 × 9
     id day   sent_to         received_from reference      decrease increase change balance
  <int> <chr> <chr>           <chr>         <chr>             <dbl>    <dbl>  <dbl>   <dbl>
1     1 day1  NA              ATM           add_cash           NA      50     50       50  
2     2 day2  Garden Cinema   NA            cinema_tickets     10.8    NA    -10.8     39.2
3     3 day3  Pasta House     NA            meal               12.5    NA    -12.5     26.7
4     4 day4  NA              Sarah         gift               NA      30     30       56.7
5     5 day5  Blue Superstore NA            shopping           15.2    NA    -15.2     41.4
6     6 day6  Jane            NA            reimbursed         NA       5.4    5.4     46.8
7     7 day7  Joe             NA            reimbursed         NA       7.25   7.25    54.1

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM