
Subset a data.table based on the key NOT being an element of a list

I have the following data.table:

DT = data.table(ID = c(1, 2, 4, 5, 10), A = c(13, 1, 13, 11, 12))

DT
   ID  A
1:  1 13
2:  2  1
3:  4 13
4:  5 11
5: 10 12

The contents of column A are not important. I have a list/vector test <- c(1, 5, 9, 10, 11, 12, ...) that can be many times longer than the data.table. I want to select the rows of DT whose key ID is not present in the vector test:

   ID  A
2:  2  1
3:  4 13

I think that DT[!(ID %in% test)] works, but I wanted to take advantage of data.table's fast key-based subsetting. Note that the vector test might have no elements in common with the key of DT, in which case the subset should return the data.table itself, or every key might be present in test, in which case it should return an empty data.table. Any suggestions?

What about a not-join, with test converted to a data.table:

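library(data.table)
DT   <- data.table(ID = c(1, 2, 4, 5, 10), A = c(13, 1, 13, 11, 12))
test <- data.table(ID = c(1, 5, 9, 10, 11, 12))
# Keying test is optional here, because on = "ID" already names the join column
setkey(test, ID)
# Not-join (anti-join): keep the rows of DT whose ID has no match in test
DT[!test, on = "ID"]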
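
The two edge cases raised in the question (no IDs in common, and all IDs in common) can be checked with a small sketch. The helper name drop_ids is hypothetical and not part of data.table or the answer above; it only wraps the same not-join so it can be called with different vectors.

```r
library(data.table)

DT <- data.table(ID = c(1, 2, 4, 5, 10), A = c(13, 1, 13, 11, 12))

# Hypothetical helper (an assumption for illustration): keep the rows of `dt`
# whose ID does not occur in the vector `ids`, using a not-join on ID.
drop_ids <- function(dt, ids) {
  dt[!data.table(ID = ids), on = "ID"]
}

some_overlap <- drop_ids(DT, c(1, 5, 9, 10, 11, 12))  # IDs 2 and 4 remain
no_overlap   <- drop_ids(DT, c(100, 200))             # nothing matches, all rows kept
full_overlap <- drop_ids(DT, c(1, 2, 4, 5, 10))       # everything matches, zero rows
```

In this sketch both columns being joined are doubles; if one side were integer and the other character, the join would need the types aligned first.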

We can use %in% and negate it with !, where test is the original vector from the question (not a data.table):

DT[!ID %in% test]
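
Since the question asks about key-based subsetting, here is a minimal sketch of a keyed variant next to the %in% form, assuming test is the plain numeric vector from the question with its elided tail left out. It only illustrates that the two forms select the same rows on this small example; it makes no claim about which is faster on real data.

```r
library(data.table)

DT   <- data.table(ID = c(1, 2, 4, 5, 10), A = c(13, 1, 13, 11, 12))
test <- c(1, 5, 9, 10, 11, 12)

setkey(DT, ID)               # sorts DT by ID and marks ID as the key

by_key <- DT[!J(test)]       # not-join on the key: rows whose ID is absent from test
by_in  <- DT[!ID %in% test]  # the %in% form shown above

same_rows <- identical(by_key$ID, by_in$ID)
```

Note that setkey() reorders DT by reference, so the row order of the result follows the key rather than the original input order.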

