Get the first to the last element of each group in the form of a data.table using only data.table operations

I have the following data.table:

datatable_example <- data.table(a = c(1,1,1,1,2,2,2,3,3,3,3,4,4), b = c('A', 'B', 'B', 'A', 'B', 'B', 'A', 'A', 'B', 'A', 'A', 'A', 'A'))  

 > datatable_example
    a b
 1: 1 A
 2: 1 B
 3: 1 B
 4: 1 A
 5: 2 B
 6: 2 B
 7: 2 A
 8: 3 A
 9: 3 B
10: 3 A
11: 3 A
12: 4 A
13: 4 A

I would like to filter this data.table so that, for each value of column "a", it keeps all rows up to and including the last "B" in column "b". So the desired output is:

> output
    a b
 1: 1 A
 2: 1 B
 3: 1 B
 4: 2 B
 5: 2 B
 6: 3 A
 7: 3 B

Is there any way to do this using data.table? I would rather not split the data into 3 other data.tables (using something like lapply) and then rbind or rbindlist them.

Here is an option:

DT[, rn := .I][
    DT[CJ(a, b="B", unique=TRUE), on=.(a, b), mult="last"],
    on=.(a, rn<=rn)]

output:

   a b rn i.b
1: 1 A  3   B
2: 1 B  3   B
3: 1 B  3   B
4: 2 B  6   B
5: 2 B  6   B
6: 3 A  9   B
7: 3 B  9   B

data:

DT <- data.table(a = c(1,1,1,1,2,2,2,3,3,3,3,3,3), 
    b = c('A', 'B', 'B', 'A', 'B', 'B', 'A', 'A', 'B', 'A', 'A', 'A', 'A'))    
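
The join above can be read in three steps. The sketch below spells them out on the same data; the names `last_B` and `res` are illustrative only, and `nomatch = NULL` is an addition that is assumed to be wanted so that groups without any "B" are dropped instead of producing an NA row.

```r
library(data.table)

DT <- data.table(a = c(1,1,1,1,2,2,2,3,3,3,3,3,3),
                 b = c('A', 'B', 'B', 'A', 'B', 'B', 'A', 'A', 'B', 'A', 'A', 'A', 'A'))

# Step 1: store each row's position so it can be compared in the join.
DT[, rn := .I]

# Step 2: for every value of a, look up the last row where b == "B".
# nomatch = NULL (an addition, not in the original answer) drops groups with no "B".
last_B <- DT[CJ(a, b = "B", unique = TRUE), on = .(a, b), mult = "last", nomatch = NULL]

# Step 3: non-equi join that keeps rows of the same group whose position
# is at or before the position of that last "B"; x.b is the b column of DT.
res <- DT[last_B, on = .(a, rn <= rn), .(a, b = x.b)]
res
```

Selecting `.(a, b = x.b)` in the last step returns only the original two columns, without the helper columns `rn` and `i.b` that appear in the output shown above.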

Select rows up to the last "B" value in each group.

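library(data.table)
# max(..., 0L) returns 0 for a group without any "B" (a = 4 in the question's data),
# so seq_len() yields no rows for that group instead of failing on -Inf
datatable_example[, .SD[seq_len(max(which(b == 'B'), 0L))], by = a]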

#   a b
#1: 1 A
#2: 1 B
#3: 1 B
#4: 2 B
#5: 2 B
#6: 3 A
#7: 3 B
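
A possible variant of the same idea, as a sketch rather than a benchmarked recommendation: build the row indices with `.I` and a reversed cumulative sum, so that a row is kept while at least one "B" remains at or after it within its group. The names `keep` and `result` are illustrative. Groups without any "B" simply contribute no indices.

```r
library(data.table)

datatable_example <- data.table(
  a = c(1,1,1,1,2,2,2,3,3,3,3,4,4),
  b = c('A', 'B', 'B', 'A', 'B', 'B', 'A', 'A', 'B', 'A', 'A', 'A', 'A'))

# For each row, rev(cumsum(rev(b == "B"))) counts the "B"s from that row
# to the end of its group; .I turns the rows with a positive count into
# row numbers of the full table.
keep <- datatable_example[, .I[rev(cumsum(rev(b == "B"))) > 0L], by = a]$V1

# Subset the original table once, using the collected row numbers.
result <- datatable_example[keep]
result
```

On the question's data this selects rows 1, 2, 3, 5, 6, 8 and 9, which matches the desired output.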
