data.table remove rows based on lag value by group
I have a data.table in the following form:
library(data.table)

DT <- data.table(tag = rep(c("A", "B"), each = 10),
                 value = c(0, 3, 3, 3, 0, 1, 1, 1, 3, 0,
                           0, 1, 3, 1, 0, 3, 0, 1, 1, 0))
> DT
tag value
1: A 0
2: A 3
3: A 3
4: A 3
5: A 0
6: A 1
7: A 1
8: A 1
9: A 3
10: A 0
11: B 0
12: B 1
13: B 3
14: B 1
15: B 0
16: B 3
17: B 0
18: B 1
19: B 1
20: B 0
I would like to remove all rows that have a value of 3, but only those that follow a 0. That is, I would like to remove rows 2, 3, 4 and row 16, but keep row 9 and row 13.
Is there a way to do this?
A possible solution:
DT[, `:=` (threes = rleid(value==3), apz = value == 3 & shift(value) == 0)
][, if (all(!apz)) .SD, by = threes
][, c('threes','apz') := NULL]
which gives:
tag value
1: A 0
2: A 0
3: A 1
4: A 1
5: A 1
6: A 3
7: A 0
8: B 0
9: B 1
10: B 3
11: B 1
12: B 0
13: B 0
14: B 1
15: B 1
16: B 0
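# Another approach: take the previous value within each tag, copy the
# previous value seen by the first row of each run to the whole run,
# then drop the 3s whose run came right after a 0.
DT[, prev.value := shift(value), by = tag][
  , prev.value := prev.value[1], by = .(tag, rleid(value))][
  !(value == 3 & prev.value == 0)]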
# tag value prev.value
# 1: A 0 NA
# 2: A 0 3
# 3: A 1 0
# 4: A 1 0
# 5: A 1 0
# 6: A 3 1
# 7: A 0 3
# 8: B 0 NA
# 9: B 1 0
#10: B 3 1
#11: B 1 3
#12: B 0 1
#13: B 0 3
#14: B 1 0
#15: B 1 0
#16: B 0 1
Here's a one-liner of sorts (props to @Procrastinatus for the improvement):
DT[setDT(rle(value))[, rep(!( values==3 & shift(values)==0 ), lengths)] ]
To understand how it works, try running DT[, setDT(rle(value))], which shows how R summarizes runs of consecutive equal values, and read ?rle.
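The run-based idea can be traced step by step on a small vector. The following is a minimal sketch in base R only, using the ten values of tag "A" from the question; the names r, prev, drop_run, keep and kept are illustrative and not part of the original answer, and the previous run's value is computed by hand rather than with data.table's shift.

```r
# Values for tag "A" from the question
value <- c(0, 3, 3, 3, 0, 1, 1, 1, 3, 0)

# rle() collapses consecutive equal values into runs
r <- rle(value)
# r$lengths: 1 3 1 3 1 1
# r$values:  0 3 0 1 3 0

# Value of the preceding run (NA for the first run)
prev <- c(NA, head(r$values, -1))

# A run is dropped when it is a run of 3s that directly follows a run of 0s
drop_run <- r$values == 3 & !is.na(prev) & prev == 0

# Expand the per-run decision back to one flag per row
keep <- rep(!drop_run, r$lengths)
kept <- value[keep]
# kept: 0 0 1 1 1 3 0
```

The three 3s after the leading 0 are removed as one run, while the single 3 that follows the run of 1s is kept, which matches the rows for tag "A" in the expected result.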
My original approach was:
DT[ rleid(value) %in% setDT(rle(value))[ , .I[!( values==3 & shift(values)==0 )]] ]
Try DT[, rleid(value)] and read ?rleid for details. This second approach is worse because the runs are evaluated twice (using both rle and rleid).