
Mutate a variable with a few conditions

I have a dataset and want to create a column called stock. The rules are as follows.

  1. If an agent won the game in the previous month, they have a stock based on the quantity in the previous month.
  2. This stock vanishes after three months (e.g., if an agent won on 2020-02-01, the stock has disappeared by the time they participate in a game on 2020-06-01).
  3. There are many agents: A, B, etc.

How do I create such a column using a tibble?

date       id    win   quantity stock
<date>     <chr> <dbl> <dbl>   <dbl>
2020-01-01 A     0     60       0
2020-02-01 A     1     50       0
2020-03-01 A     0     50       50 ## has 50 for 3 months because A won in the previous month
2020-04-01 A     1     100      50
2020-05-01 A     0     10      150 ## gains 100 for 3 months because A won in the previous month
2020-06-01 A     0     10      100 ## the 50 disappears after 3 months
2020-07-01 A     0     100     100 ## the 50 is already gone; the 100 remains
2020-08-01 A     0     100      0  ## the 100 disappears after 3 months
2020-01-01 B     0     60       0
2020-02-01 B     0     50       0
2020-03-01 B     0     50       0  
2020-04-01 B     1     10       0 
2020-05-01 B     0     10       10 
2020-08-01 B     0     100      0

Edit: raw data

date = c("2020-01-01", "2020-02-01", "2020-03-01","2020-04-01", "2020-05-01", "2020-06-01", "2020-07-01", "2020-08-01", "2020-01-01", "2020-02-01", "2020-03-01", "2020-04-01", "2020-05-01", "2020-08-01")
id = c("A", "A", "A", "A", "A", "A", "A", "A", "B", "B", "B", "B", "B", "B")
win = c(0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0)
quantity = c(60, 50, 50, 100, 10, 10, 100, 100, 60, 50, 50, 10, 10, 100)
library(tibble)
# Store the data as `dat`, the name used in the answer's code
dat <- tibble(date = as.Date(date), id = id, win = win, quantity = quantity)

This can be done as a rolling calculation over the previous 3 rows within each agent. That means a rolling width of 4, with the current row dropped from each window.
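To make the window explicit, here is a minimal base R sketch of the same idea without zoo. The helper name prev3_sum is hypothetical, and the input is agent A's quantity * (win > 0) taken from the question.

```r
# Hypothetical helper: for each position, sum up to three preceding values,
# excluding the current one.
prev3_sum <- function(x) {
  vapply(seq_along(x), function(i) {
    earlier <- utils::tail(seq_len(i - 1), 3)  # indices of at most 3 prior rows
    sum(x[earlier])                            # an empty selection sums to 0
  }, numeric(1))
}

# Agent A: a quantity only counts in the months A won
quantity <- c(60, 50, 50, 100, 10, 10, 100, 100)
win      <- c(0, 1, 0, 1, 0, 0, 0, 0)
stock_a  <- prev3_sum(quantity * (win > 0))
print(stock_a)
```

The result should match A's stock column in the question: the first rows see fewer than three earlier values, which is the role partial = TRUE plays in the rollapplyr call.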

(FYI, I think your last stock value, for B on 2020-08-01, should be 10.)
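That 10 comes from counting rows: B has no rows for June or July, so the three rows before 2020-08-01 still include the April win. If the three-month window is instead meant in calendar months, which is an assumption about the intended rule, the gap matters. A hedged base R sketch of that reading follows; stock_by_month is a hypothetical helper, not part of zoo or dplyr.

```r
# Hypothetical helper: for a single agent, sum the winning quantities from
# the 1 to `months` calendar months before each row.
stock_by_month <- function(date, win, quantity, months = 3) {
  lt <- as.POSIXlt(date)
  month_index <- lt$year * 12 + lt$mon     # consecutive months differ by 1
  won_qty <- quantity * (win > 0)
  vapply(seq_along(month_index), function(i) {
    gap <- month_index[i] - month_index    # months elapsed since each row
    sum(won_qty[gap >= 1 & gap <= months])
  }, numeric(1))
}

# Agent B from the question (no rows for June and July)
date     <- as.Date(c("2020-01-01", "2020-02-01", "2020-03-01",
                      "2020-04-01", "2020-05-01", "2020-08-01"))
win      <- c(0, 0, 0, 1, 0, 0)
quantity <- c(60, 50, 50, 10, 10, 100)
stock_b  <- stock_by_month(date, win, quantity)
print(stock_b)
```

Under this reading B's last value is 0, as in the question's table. To apply it per agent, the helper could be called inside a grouped mutate(), e.g. group_by(id) %>% mutate(stock = stock_by_month(date, win, quantity)).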

dplyr

library(dplyr)
# library(zoo) # rollapplyr
dat %>%
  group_by(id) %>%
  mutate(
    stock = zoo::rollapplyr(
      quantity * (win > 0), 4, FUN = function(z) sum(z[-length(z)]),
      by.column = FALSE, partial = TRUE)
  ) %>%
  ungroup()
# # A tibble: 14 x 5
#    date       id      win quantity stock
#    <date>     <chr> <dbl>    <dbl> <dbl>
#  1 2020-01-01 A         0       60     0
#  2 2020-02-01 A         1       50     0
#  3 2020-03-01 A         0       50    50
#  4 2020-04-01 A         1      100    50
#  5 2020-05-01 A         0       10   150
#  6 2020-06-01 A         0       10   100
#  7 2020-07-01 A         0      100   100
#  8 2020-08-01 A         0      100     0
#  9 2020-01-01 B         0       60     0
# 10 2020-02-01 B         0       50     0
# 11 2020-03-01 B         0       50     0
# 12 2020-04-01 B         1       10     0
# 13 2020-05-01 B         0       10    10
# 14 2020-08-01 B         0      100    10

base R

(But still using zoo::rollapplyr.)

dat$stock <- 
  ave(dat$quantity * (dat$win > 0), dat$id, FUN = function(z) {
    zoo::rollapplyr(z, 4, FUN = function(z) sum(z[-length(z)]),
                    by.column = FALSE, partial = TRUE)
  })

data.table

library(data.table)
DT <- as.data.table(dat)
DT[, stock := zoo::rollapplyr(quantity * (win > 0), 4, FUN = function(z) sum(z[-length(z)]),
                             by.column = FALSE, partial = TRUE),
   by = .(id) ]

(I'm using DT <- as.data.table(dat) here in case the user runs this without realizing that setDT(dat) changes dat in place. I believe setDT(dat) is the canonical, recommended way to convert another frame to data.table class; use it if you're going to make the switch from "tbl_df".)
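The difference between the two conversions can be seen on a small data frame. This is a minimal sketch; the object names df and dt_copy and the toy values are made up for illustration.

```r
library(data.table)

# Toy data frame, for illustration only
df <- data.frame(id = c("A", "B"), quantity = c(50, 10))

dt_copy <- as.data.table(df)   # returns a new data.table object
before  <- is.data.table(df)   # df itself is still a plain data.frame

setDT(df)                      # converts df in place, by reference
after   <- is.data.table(df)   # df is now a data.table as well

print(c(before = before, after = after))
```

With as.data.table() the original object is left alone, which is the safer choice if other code still expects a tibble or data.frame.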
