Keep the rows of each group up to the last "B", as a data.table, using only data.table operations

I have the following datatable:

datatable_example <- data.table(a = c(1,1,1,1,2,2,2,3,3,3,3,4,4), b = c('A', 'B', 'B', 'A', 'B', 'B', 'A', 'A', 'B', 'A', 'A', 'A', 'A'))  

 > datatable_example
    a b
 1: 1 A
 2: 1 B
 3: 1 B
 4: 1 A
 5: 2 B
 6: 2 B
 7: 2 A
 8: 3 A
 9: 3 B
10: 3 A
11: 3 A
12: 4 A
13: 4 A

I would like to filter this data.table so that, for each value of column "a", it keeps all rows up to and including the last "B" in column "b". Groups that contain no "B" (a = 4 here) should be dropped entirely. So the desired output is:

> output
    a b
 1: 1 A
 2: 1 B
 3: 1 B
 4: 2 B
 5: 2 B
 6: 3 A
 7: 3 B

Do you know of any way to do this using data.table? I would rather not split the data into separate data.tables (using something like lapply) and then rbind or rbindlist them.

Here is an option:

DT[, rn := .I][
    DT[CJ(a, b="B", unique=TRUE), on=.(a, b), mult="last"],
    on=.(a, rn<=rn)]

output:

   a b rn i.b
1: 1 A  3   B
2: 1 B  3   B
3: 1 B  3   B
4: 2 B  6   B
5: 2 B  6   B
6: 3 A  9   B
7: 3 B  9   B
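
The chained join above can be easier to follow when split into named steps. This is only a sketch under a few assumptions: it uses the data from the question (including group a = 4, which has no "B") rather than the DT shown below, the helper tables groups and last_B are names made up for illustration, and nomatch = NULL is an addition that is not part of the answer's code.

```r
library(data.table)

# Data from the question; group a = 4 contains no "B".
DT <- data.table(a = c(1,1,1,1,2,2,2,3,3,3,3,4,4),
                 b = c('A','B','B','A','B','B','A','A','B','A','A','A','A'))

# Step 1: record each row's position so it can serve as a join bound.
DT[, rn := .I]

# Step 2: one lookup row per group, asking for b == "B".
groups <- data.table(a = unique(DT$a), b = "B")

# Step 3: find the last "B" row of each group.
# nomatch = NULL (an assumption added here) drops groups without any "B".
last_B <- DT[groups, on = .(a, b), mult = "last", nomatch = NULL]

# Step 4: non-equi join keeps rows of DT at or before that position.
res <- DT[last_B, on = .(a, rn <= rn)]
res
```

As in the output above, the rn column of the result holds the join bound (the position of the last "B" in the group), not the original row number, and i.b comes from the lookup table.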

data:

DT <- data.table(a = c(1,1,1,1,2,2,2,3,3,3,3,3,3), 
    b = c('A', 'B', 'B', 'A', 'B', 'B', 'A', 'A', 'B', 'A', 'A', 'A', 'A'))    

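Select rows up to and including the last "B" value in each group. The if (any(...)) guard skips groups that contain no "B" (a = 4 in the example data); without it, max(which(...)) returns -Inf for such a group and seq_len() raises an error.

library(data.table)
datatable_example[, if (any(b == 'B')) .SD[seq_len(max(which(b == 'B')))], by = a]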

#   a b
#1: 1 A
#2: 1 B
#3: 1 B
#4: 2 B
#5: 2 B
#6: 3 A
#7: 3 B
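
A possible variant of the same idea collects row numbers with .I and subsets once, which avoids building .SD for every group. This is a sketch rather than a benchmarked recommendation; keep and out are names chosen here for illustration, and the max(0L, ...) guard is one assumed way to handle groups that have no "B".

```r
library(data.table)

datatable_example <- data.table(
  a = c(1,1,1,1,2,2,2,3,3,3,3,4,4),
  b = c('A','B','B','A','B','B','A','A','B','A','A','A','A'))

# For each group, take the global row numbers (.I) up to the last "B".
# max(0L, ...) gives 0 when a group has no "B", so seq_len() selects nothing.
keep <- datatable_example[, .I[seq_len(max(0L, which(b == "B")))], by = a]$V1

# Subset the original table once with the collected row numbers.
out <- datatable_example[keep]
out
```

Group a = 4 contributes no row numbers, so it is absent from the result, which matches the desired output in the question.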
