Mutate a variable with a few conditions
I have a dataset and want to create a column called `stock`. In particular, the rule is the following.
- An agent who wins gets `stock` based on the quantity in the previous month.
- `stock` vanishes after three months (e.g., if an agent won on 2020-02-01, this stock disappears when they participate in a game on 2020-06-01).
- The rule applies separately to each agent: `A`, `B`, etc.
How do I create such a column using `tibble`?
date id win quantity stock
<date> <chr> <dbl> <dbl> <dbl>
2020-01-01 A 0 60 0
2020-02-01 A 1 50 0
2020-03-01 A 0 50 50 ## have 50 for 3 months because A win in the previous month
2020-04-01 A 1 100 50
2020-05-01 A 0 10 150 ## have 100 for 3 months because A win in the previous month
2020-06-01 A 0 10 100 ## disappear 50 after 3 months
2020-07-01 A 0 100 100 ## disappear 50 after 3 months
2020-08-01 A 0 100 0 ## disappear 100 after 3 months
2020-01-01 B 0 60 0
2020-02-01 B 0 50 0
2020-03-01 B 0 50 0
2020-04-01 B 1 10 0
2020-05-01 B 0 10 10
2020-08-01 B 0 100 0
Edit: raw data
library(tibble)
date = c("2020-01-01", "2020-02-01", "2020-03-01", "2020-04-01", "2020-05-01", "2020-06-01", "2020-07-01", "2020-08-01", "2020-01-01", "2020-02-01", "2020-03-01", "2020-04-01", "2020-05-01", "2020-08-01")
id = c("A", "A", "A", "A", "A", "A", "A", "A", "B", "B", "B", "B", "B", "B")
win = c(0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0)
quantity = c(60, 50, 50, 100, 10, 10, 100, 100, 60, 50, 50, 10, 10, 100)
# stored as `dat`, the name used in the answer's code
dat <- tibble(date = as.Date(date), id = id, win = win, quantity = quantity)
This can be done as a rolling calculation over the previous 3 rows. That means a rolling width of 4, where the current row is dropped from each window before summing.
(FYI, I think your last `stock` value should be a 10.)
library(dplyr)
# library(zoo) # rollapplyr
dat %>%
  group_by(id) %>%
  mutate(
    stock = zoo::rollapplyr(
      quantity * (win > 0), 4, FUN = function(z) sum(z[-length(z)]),
      by.column = FALSE, partial = TRUE)
  ) %>%
  ungroup()
# # A tibble: 14 x 5
# date id win quantity stock
# <date> <chr> <dbl> <dbl> <dbl>
# 1 2020-01-01 A 0 60 0
# 2 2020-02-01 A 1 50 0
# 3 2020-03-01 A 0 50 50
# 4 2020-04-01 A 1 100 50
# 5 2020-05-01 A 0 10 150
# 6 2020-06-01 A 0 10 100
# 7 2020-07-01 A 0 100 100
# 8 2020-08-01 A 0 100 0
# 9 2020-01-01 B 0 60 0
# 10 2020-02-01 B 0 50 0
# 11 2020-03-01 B 0 50 0
# 12 2020-04-01 B 1 10 0
# 13 2020-05-01 B 0 10 10
# 14 2020-08-01 B 0 100 10
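To make the window logic above concrete, here is a minimal dependency-free sketch of what the rolling call computes for agent A. The helper name `prev3_sum` is made up for illustration and is not part of `zoo` or `dplyr`. It only mimics the "width 4, drop the current element" behaviour on a plain vector.

```r
# Winning quantities for agent A from the example data: quantity * (win > 0)
x <- c(0, 50, 0, 100, 0, 0, 0, 0)

# prev3_sum() is a hypothetical helper: for each position i, add up the
# (at most) three values before i, excluding x[i] itself.
prev3_sum <- function(x) {
  vapply(seq_along(x), function(i) {
    idx <- seq_len(i - 1)      # all earlier positions (empty when i == 1)
    idx <- idx[idx >= i - 3]   # keep only the previous three
    sum(x[idx])                # sum of an empty selection is 0
  }, numeric(1))
}

stock_a <- prev3_sum(x)
print(stock_a)
```

The result should line up with the `stock` values shown for agent A in the output above. The early rows have fewer than three predecessors, which is the role `partial = TRUE` plays in the `rollapplyr` call.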
Base R (but still using `zoo::rollapplyr`):
dat$stock <-
  ave(dat$quantity * (dat$win > 0), dat$id, FUN = function(z) {
    zoo::rollapplyr(z, 4, FUN = function(z) sum(z[-length(z)]),
                    by.column = FALSE, partial = TRUE)
  })
library(data.table)
DT <- as.data.table(dat)
DT[, stock := zoo::rollapplyr(quantity * (win > 0), 4,
                              FUN = function(z) sum(z[-length(z)]),
                              by.column = FALSE, partial = TRUE),
   by = .(id) ]
(I'm using `DT <- as.data.table(dat)` here in case the user chooses to use this without realizing the in-place change that `setDT(dat)` makes to the data. I believe that `setDT(dat)` is the canonical and recommended way to convert from another frame into `data.table` class; use that if you're going to make the switch from `"tbl_df"`.)
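One caveat: the rolling approaches above count rows, not calendar months, so a gap in an agent's dates (agent B has no rows for 2020-06 and 2020-07) changes the result. If the three-month rule is meant in calendar time, which is an assumption about the question rather than something it states outright, a date-based version can be sketched in base R. The names `games`, `three_months_before`, and `stock_by_date` are made up for this example.

```r
# Example data from the question, rebuilt as a plain data.frame
games <- data.frame(
  date = as.Date(c("2020-01-01", "2020-02-01", "2020-03-01", "2020-04-01",
                   "2020-05-01", "2020-06-01", "2020-07-01", "2020-08-01",
                   "2020-01-01", "2020-02-01", "2020-03-01", "2020-04-01",
                   "2020-05-01", "2020-08-01")),
  id = rep(c("A", "B"), times = c(8, 6)),
  win = c(0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0),
  quantity = c(60, 50, 50, 100, 10, 10, 100, 100, 60, 50, 50, 10, 10, 100)
)

# Hypothetical helper: the date three calendar months before d
three_months_before <- function(d) seq(d, by = "-3 months", length.out = 2)[2]

# For each row, sum the quantities of the same agent's wins that fall
# in the three calendar months strictly before that row's date.
games$stock_by_date <- vapply(seq_len(nrow(games)), function(i) {
  d <- games$date[i]
  earlier_wins <- games$id == games$id[i] & games$win > 0 &
    games$date < d & games$date >= three_months_before(d)
  sum(games$quantity[earlier_wins])
}, numeric(1))

print(games)
```

Under that reading, the last row for B comes out as 0, as in the question's table, rather than the 10 produced by the row-based window. This loop is quadratic in the number of rows, so it is only meant to show the logic on small data.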