Generating test data in R
I am trying to generate this table as one of the inputs to a test.
id diff d
1: 1 2 2020-07-31
2: 1 1 2020-08-01
3: 1 1 2020-08-02
4: 1 1 2020-08-03
5: 1 1 2020-08-04
6: 2 2 2020-07-31
7: 2 1 2020-08-01
8: 2 1 2020-08-02
9: 2 1 2020-08-03
10: 2 1 2020-08-04
11: 3 2 2020-07-31
12: 3 1 2020-08-01
13: 3 1 2020-08-02
14: 3 1 2020-08-03
15: 3 1 2020-08-04
16: 4 2 2020-07-31
17: 4 1 2020-08-01
18: 4 1 2020-08-02
19: 4 1 2020-08-03
20: 4 1 2020-08-04
21: 5 2 2020-07-31
22: 5 1 2020-08-01
23: 5 1 2020-08-02
24: 5 1 2020-08-03
25: 5 1 2020-08-04
id diff d
I have done it like this:
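library(data.table)
# one row per id, with a default diff of 1
input1 = data.table(id=as.character(1:5), diff=1)
# repeat the same 5-day date sequence within each (id, diff) group
input1 = input1[,.(d=seq(as.Date('2020-07-31'), by='days', length.out = 5)),.(id, diff)]
# hardcoded override for the Friday
input1[d == '2020-07-31']$diff = 2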
diff is basically the number of days to the next weekday. E.g. 31st Jul 2020 is a Friday, hence diff is 2, which is the diff to the next weekday, Monday. For the others it will be 1.
I personally don't like that I had to generate the date sequence for each of the ids separately, or that I had to hardcode the diff for 31st July in the input. Is there a more generic way of doing this without the hardcoding?
We can create all combinations of dates and id using crossing, and create the diff column based on whether the weekday is "Friday".
library(dplyr)
tidyr::crossing(id = 1:5, d = seq(as.Date('2020-07-31'),
by='days', length.out = 5)) %>%
mutate(diff = as.integer(weekdays(d) == 'Friday') + 1)
Similar logic using base R expand.grid:
transform(expand.grid(id = 1:5,
d = seq(as.Date('2020-07-31'), by='days', length.out = 5)),
diff = as.integer(weekdays(d) == 'Friday') + 1)
and CJ in data.table:
library(data.table)
df <- CJ(id = 1:5, d = seq(as.Date('2020-07-31'), by='days', length.out = 5))
df[, diff := as.integer(weekdays(d) == 'Friday') + 1]
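One caveat with the snippets above: weekdays() returns day names in the current locale, so the comparison with 'Friday' only works under an English locale. A minimal base R sketch that avoids this, assuming the same five ids and five dates as in the question, tests the numeric weekday instead. The variable names ids and dates are only illustrative.

```r
# Base R sketch: build every id/date pair, then flag Fridays by weekday number.
ids   <- as.character(1:5)
dates <- seq(as.Date("2020-07-31"), by = "day", length.out = 5)

# expand.grid varies its first argument fastest, so d goes first and id second
# to get the rows grouped by id, as in the table from the question.
input1 <- expand.grid(d = dates, id = ids, stringsAsFactors = FALSE)

# as.POSIXlt()$wday runs from 0 (Sunday) to 6 (Saturday), so 5 is Friday
# regardless of locale.
input1$diff <- ifelse(as.POSIXlt(input1$d)$wday == 5, 2, 1)

# Reorder the columns to match the question: id, diff, d.
input1 <- input1[, c("id", "diff", "d")]
```

This gives a plain data.frame. If a data.table is needed for the test, it can be converted with data.table::as.data.table(input1).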