Get the first to the last element of each group in the form of a data.table using only data.table operations
I have the following data.table:
datatable_example <- data.table(a = c(1,1,1,1,2,2,2,3,3,3,3,4,4), b = c('A', 'B', 'B', 'A', 'B', 'B', 'A', 'A', 'B', 'A', 'A', 'A', 'A'))
> datatable_example
a b
1: 1 A
2: 1 B
3: 1 B
4: 1 A
5: 2 B
6: 2 B
7: 2 A
8: 3 A
9: 3 B
10: 3 A
11: 3 A
12: 4 A
13: 4 A
I would like to filter this data.table so that, for each value of column "a", it keeps all rows up to and including the last "B" in column "b". So the desired output is:
> output
a b
1: 1 A
2: 1 B
3: 1 B
4: 2 B
5: 2 B
6: 3 A
7: 3 B
Is there a way to do this using data.table? I would rather not split the table into separate data.tables (using something like lapply) and then rbind or rbindlist them.
Here is an option:
DT[, rn := .I][
DT[CJ(a, b="B", unique=TRUE), on=.(a, b), mult="last"],
on=.(a, rn<=rn)]
output:
a b rn i.b
1: 1 A 3 B
2: 1 B 3 B
3: 1 B 3 B
4: 2 B 6 B
5: 2 B 6 B
6: 3 A 9 B
7: 3 B 9 B
data:
DT <- data.table(a = c(1,1,1,1,2,2,2,3,3,3,3,3,3),
b = c('A', 'B', 'B', 'A', 'B', 'B', 'A', 'A', 'B', 'A', 'A', 'A', 'A'))
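The join above can be unpacked into two steps, which may make it easier to follow. This is only a sketch of the same idea, not the exact code from the answer: the names `last_b` and `res` are made up for illustration, the lookup of the last "B" uses a grouped `max(rn)` instead of `CJ(...)` with `mult="last"`, and `nomatch = NULL` is used so that no unmatched rows are returned.

```r
library(data.table)

DT <- data.table(a = c(1,1,1,1,2,2,2,3,3,3,3,3,3),
                 b = c('A', 'B', 'B', 'A', 'B', 'B', 'A', 'A', 'B', 'A', 'A', 'A', 'A'))

# Step 1: record each row's position in the table.
DT[, rn := .I]

# Step 2: for every group that contains a "B", find the row number of its last "B".
# (last_b is a hypothetical helper name; groups without a "B" simply do not appear.)
last_b <- DT[b == "B", .(rn = max(rn)), by = a]

# Step 3: non-equi join. For each group in last_b, keep the rows of DT that
# belong to the same group and sit at or before that last "B".
res <- DT[last_b, .(a, b), on = .(a, rn <= rn), nomatch = NULL]
```

Selecting `.(a, b)` in `j` drops the helper columns (`rn`, `i.b`) that show up in the output above.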
Select rows until the last "B" value in each group.
library(data.table)
datatable_example[, .SD[seq_len(max(which(b == 'B')))], a]
# a b
#1: 1 A
#2: 1 B
#3: 1 B
#4: 2 B
#5: 2 B
#6: 3 A
#7: 3 B
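One thing to watch: if a group has no "B" at all (like group 4 in the question's data), `which(b == 'B')` is empty and `max()` of an empty vector is `-Inf`, which `seq_len()` cannot handle. Below is a possible variant that guards against that case, assuming such groups should be dropped entirely, as in the desired output. The names `idx` and `result` are just for illustration.

```r
library(data.table)

datatable_example <- data.table(
  a = c(1,1,1,1,2,2,2,3,3,3,3,4,4),
  b = c('A', 'B', 'B', 'A', 'B', 'B', 'A', 'A', 'B', 'A', 'A', 'A', 'A')
)

# For each group, collect the global row numbers (.I) from the start of the
# group up to its last "B". max(0L, ...) falls back to 0 when the group has
# no "B", so seq_len(0) yields no rows for that group.
idx <- datatable_example[, .I[seq_len(max(0L, which(b == "B")))], by = a]$V1

# Subset the original table once with the collected row numbers.
result <- datatable_example[idx]
```

Collecting `.I` and subsetting once also avoids building `.SD` for every group, which tends to matter on larger tables.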