In every ID group, I only want to flag those years that have n years of experience (past data) AND also have one future year. So, for example, 2020 would always get 0, because there is no 2021 in the data.
library(data.table)
ID <- c(rep("A5", 15), rep("B2", 15))
product <- rep(rep(c("prod1","prod2","prod3", "prod55", "prod4", "prod9", "prod83"),3),2)
# start <- c(rep("01.01.2016", 3), rep("01.01.2015", 3), rep("01.01.2014",3),
# rep("01.01.2013",3), rep("01.01.2012",3))
start <- rep(c(rep(2016, 3), rep(2017, 3), rep(2018 ,3),
rep(2019,3), rep(2020,3)),2)
prodID <- rep(c(3,1,2,3,1,2,3,1,2,3,2,1,3,1,2),2)
mydata <- cbind(ID, product[1:15], start, prodID)
mydata <- as.data.table(mydata)
So the result for n = 3 would be something like:
ID V2 start result
1: A5 prod1 2016 0
2: A5 prod2 2016 0
3: A5 prod3 2016 0
4: A5 prod55 2017 0
5: A5 prod4 2017 0
6: A5 prod9 2017 0
7: A5 prod83 2018 1
8: A5 prod1 2018 1
9: A5 prod2 2018 1
10: A5 prod3 2019 1
11: A5 prod55 2019 1
12: A5 prod4 2019 1
13: A5 prod9 2020 0
14: A5 prod83 2020 0
15: A5 prod1 2020 0
16: B2 prod1 2016 0
17: B2 prod2 2016 0
18: B2 prod3 2016 0
19: B2 prod55 2017 0
20: B2 prod4 2017 0
21: B2 prod9 2017 0
22: B2 prod83 2018 1
23: B2 prod1 2018 1
24: B2 prod2 2018 1
25: B2 prod3 2019 1
26: B2 prod55 2019 1
27: B2 prod4 2019 1
28: B2 prod9 2020 0
29: B2 prod83 2020 0
30: B2 prod1 2020 0
We can use between:
library(data.table)
n = 3
mydata[, result := +(between(start, min(start) + n - 1, max(start) - 1)), ID]
which returns
mydata
# ID V2 start result
# 1: A5 prod1 2016 0
# 2: A5 prod2 2016 0
# 3: A5 prod3 2016 0
# 4: A5 prod55 2017 0
# 5: A5 prod4 2017 0
# 6: A5 prod9 2017 0
# 7: A5 prod83 2018 1
# 8: A5 prod1 2018 1
# 9: A5 prod2 2018 1
#10: A5 prod3 2019 1
#11: A5 prod55 2019 1
#12: A5 prod4 2019 1
#13: A5 prod9 2020 0
#14: A5 prod83 2020 0
#15: A5 prod1 2020 0
#16: B2 prod1 2016 0
#17: B2 prod2 2016 0
#18: B2 prod3 2016 0
#19: B2 prod55 2017 0
#20: B2 prod4 2017 0
#21: B2 prod9 2017 0
#22: B2 prod83 2018 1
#23: B2 prod1 2018 1
#24: B2 prod2 2018 1
#25: B2 prod3 2019 1
#26: B2 prod55 2019 1
#27: B2 prod4 2019 1
#28: B2 prod9 2020 0
#29: B2 prod83 2020 0
#30: B2 prod1 2020 0
# ID V2 start result
between returns a logical TRUE/FALSE value indicating whether each value lies in the range between two bounds (inclusive by default). An equivalent way would be:
mydata[, result := +(start >= min(start) + n - 1 & start <= max(start) - 1), ID]
The unary + converts the logical values (TRUE/FALSE) to integer values (1/0).
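As a quick sanity check, here is a minimal sketch on a reduced version of the data (one row per year). The names dt, via_between and via_compare are made up for illustration and are not part of the original question. It computes the flag both ways and compares them.

```r
library(data.table)

# Toy data (hypothetical): two IDs, one row per year from 2016 to 2020
dt <- data.table(ID = rep(c("A5", "B2"), each = 5),
                 start = rep(2016:2020, 2))
n <- 3

# Flag via between(); both bounds are inclusive by default
dt[, via_between := +(between(start, min(start) + n - 1, max(start) - 1)), by = ID]

# Flag via explicit comparisons
dt[, via_compare := +(start >= min(start) + n - 1 & start <= max(start) - 1), by = ID]

# The two columns should agree row by row
identical(dt$via_between, dt$via_compare)
```

In this toy data only 2018 and 2019 are flagged in each group: 2016 and 2017 lack n years of history, and 2020 has no following year.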
data
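Don't use cbind while creating data; use data.frame or data.table directly. cbind on plain vectors builds a matrix, and a matrix holds a single type, so the numeric start column is coerced to character.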
mydata <- data.table(ID, product[1:15], start)
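To illustrate why this matters, the following sketch uses two short made-up vectors (m and dt_ok are illustrative names only) and compares the column types produced by cbind and by data.table.

```r
library(data.table)

ID <- c("A5", "B2")
start <- c(2016, 2017)

# cbind() on atomic vectors builds a matrix; a matrix has a single type,
# so the numeric years are coerced to character
m <- cbind(ID, start)
class(m[, "start"])   # "character"

# data.table() keeps each column's own type
dt_ok <- data.table(ID, start)
class(dt_ok$start)    # "numeric"
```

With start stored as character, arithmetic such as min(start) + n - 1 would fail, which is why the data is built with data.table directly here.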