How to add a column to a dataset which adds values from one column and subtracts values from another column in R?

Here is a small example of the kind of data I have:

library(tibble)

transactions <- tibble(id = 1:7,
                       day = paste0("day", 1:7),
                       sent_to = c(NA, "Garden Cinema", "Pasta House", NA, "Blue Superstore", "Jane", "Joe"),
                       received_from = c("ATM", NA, NA, "Sarah", NA, NA, NA),
                       reference = c("add_cash", "cinema_tickets", "meal", "gift", "shopping", "reimbursed", "reimbursed"),
                       decrease = c(NA, 10.8, 12.5, NA, 15.25, NA, NA),
                       increase = c(50, NA, NA, 30, NA, 5.40, 7.25))

# A tibble: 7 × 7
#      id day   sent_to         received_from reference      decrease increase
#   <int> <chr> <chr>           <chr>         <chr>             <dbl>    <dbl>
# 1     1 day1  NA              ATM           add_cash           NA      50
# 2     2 day2  Garden Cinema   NA            cinema_tickets     10.8    NA
# 3     3 day3  Pasta House     NA            meal               12.5    NA
# 4     4 day4  NA              Sarah         gift               NA      30
# 5     5 day5  Blue Superstore NA            shopping           15.2    NA
# 6     6 day6  Jane            NA            reimbursed         NA       5.4
# 7     7 day7  Joe             NA            reimbursed         NA       7.25

I would like to add a "balance" column to this dataset where:

  • Row 1: starts at 50
  • Row 2: previous balance + increase - decrease
  • Row 3 onward: same formula as row 2

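I have been struggling to do this on my own because I don't know whether any existing function can help with this type of data. The only function that came to mind is dplyr::lag(), but I am not sure how to use it.

Any help is appreciated :)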

You can first create a change column and then use purrr::accumulate to create your balance column:

library(dplyr, warn.conflicts = FALSE)
library(purrr)

transactions |> 
  mutate(change = coalesce(increase, -decrease),
         balance = accumulate(change, ~ .x + .y))
#> # A tibble: 7 × 9
#>      id day   sent_to         received_…¹ refer…² decre…³ incre…⁴ change balance
#>   <int> <chr> <chr>           <chr>       <chr>     <dbl>   <dbl>  <dbl>   <dbl>
#> 1     1 day1  <NA>            ATM         add_ca…    NA     50     50       50  
#> 2     2 day2  Garden Cinema   <NA>        cinema…    10.8   NA    -10.8     39.2
#> 3     3 day3  Pasta House     <NA>        meal       12.5   NA    -12.5     26.7
#> 4     4 day4  <NA>            Sarah       gift       NA     30     30       56.7
#> 5     5 day5  Blue Superstore <NA>        shoppi…    15.2   NA    -15.2     41.4
#> 6     6 day6  Jane            <NA>        reimbu…    NA      5.4    5.4     46.8
#> 7     7 day7  Joe             <NA>        reimbu…    NA      7.25   7.25    54.1
#> # … with abbreviated variable names ¹​received_from, ²​reference, ³​decrease,
#> #   ⁴​increase
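
To make the two steps easier to follow, here is a minimal sketch on plain vectors. It assumes only dplyr and purrr are installed; the vectors are copied from the first four rows of the example data and are there purely for illustration.

```r
library(dplyr, warn.conflicts = FALSE)
library(purrr)

# Illustrative vectors taken from the first four rows of the example data
increase <- c(50, NA, NA, 30)
decrease <- c(NA, 10.8, 12.5, NA)

# coalesce() returns the first non-missing value at each position:
# a row with an increase keeps it, otherwise it falls back to -decrease
change <- coalesce(increase, -decrease)

# accumulate() returns every intermediate result of the running sum
balance <- accumulate(change, ~ .x + .y)

change   # 50, -10.8, -12.5, 30
balance  # 50, 39.2, 26.7, 56.7
```

Because the update step is plain addition, cumsum(change) gives the same result here; accumulate() mainly pays off when the update rule is something other than a simple sum.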

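You can first create a change column that holds negative values for decreases and positive values for increases. You can then use the cumsum function to build a running total for the balance column.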

library(dplyr)

transactions <- transactions %>%
  mutate(
    change = case_when(
      !is.na(decrease) ~ -decrease,  # decreases become negative
      !is.na(increase) ~ increase),  # increases stay positive
    balance = cumsum(change))

Output:

> transactions
# A tibble: 7 × 9
     id day   sent_to         received_from reference      decrease increase change balance
  <int> <chr> <chr>           <chr>         <chr>             <dbl>    <dbl>  <dbl>   <dbl>
1     1 day1  NA              ATM           add_cash           NA      50     50       50  
2     2 day2  Garden Cinema   NA            cinema_tickets     10.8    NA    -10.8     39.2
3     3 day3  Pasta House     NA            meal               12.5    NA    -12.5     26.7
4     4 day4  NA              Sarah         gift               NA      30     30       56.7
5     5 day5  Blue Superstore NA            shopping           15.2    NA    -15.2     41.4
6     6 day6  Jane            NA            reimbursed         NA       5.4    5.4     46.8
7     7 day7  Joe             NA            reimbursed         NA       7.25   7.25    54.1
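
Both answers rely on every row having exactly one of increase or decrease. If a row could have both or neither (an assumption that goes beyond the example data), change would be NA or would drop one of the amounts, and cumsum would carry the NA forward. A hedged sketch of one way to handle that, using a small hypothetical tibble named tx:

```r
library(dplyr, warn.conflicts = FALSE)
library(tibble)

# Hypothetical data: row 2 has both amounts, row 3 has neither
tx <- tibble(
  increase = c(50, 5, NA, NA),
  decrease = c(NA, 2, NA, 10)
)

tx <- tx %>%
  mutate(
    # Treat a missing amount as 0 so every row yields a defined change
    change = coalesce(increase, 0) - coalesce(decrease, 0),
    balance = cumsum(change)
  )

tx$change   # 50, 3, 0, -10
tx$balance  # 50, 53, 53, 43
```

Whether a missing amount should really count as 0 depends on the data; if an NA in both columns signals a recording error, it may be better to let it surface than to fill it silently.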
