
Check which rows in a data.table are identical

I need to find out which rows of a data.table are identical, but I can't come up with a clean solution (one without a pile of nested loops). I would prefer a data.table solution. What I want is a list in which each element holds the row numbers of one set of identical rows.

An example:

library(data.table)
Data <- data.table(A = c("a", "a", "c"), 
                   B = c("A", "A", "B"))

The first and second rows are identical. My desired output:

[[1]]
[1] 1 2

[[2]]
[1] 3

Here is something quick and dirty:

Data[, .(.I, .GRP), by = .(A, B)][, list(split(I, GRP))]$V1

This can be simplified to:

Data[, .(list(.I)), by = .(A, B)]$V1
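To make the grouping idea easier to try out, here is a minimal, self-contained sketch of the same approach. It assumes every column should take part in the comparison, so it groups by `names(Data)` instead of spelling out `A` and `B`; the column name `rows` and the variable names `groups` and `identical_rows` are only illustrative.

```r
library(data.table)

Data <- data.table(A = c("a", "a", "c"),
                   B = c("A", "A", "B"))

# Group by every column; inside each group, .I holds the original row numbers.
# Wrapping .I in list() stores them as one list-column entry per group.
groups <- Data[, .(rows = list(.I)), by = names(Data)]

# One list element per distinct row, in order of first appearance.
identical_rows <- groups$rows
print(identical_rows)
```

For the example data this should give the desired list, with rows 1 and 2 in the first element and row 3 in the second.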

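This was my own solution before sindri_baldur came up with the better one:

library(purrr)  # provides map() and re-exports %>%

# Give every distinct row a group id G
Data.unique <- unique(Data)
Data.unique[, G := .I]
# Record each original row number in I
Data[, I := .I]
# Attach the group id to every original row
Data.full <- 
  merge(Data,
        Data.unique,
        by = c("A", "B"))

# One list element per group, holding that group's row numbers
Data.full %>% 
  split(by = "G") %>% 
  map(~ .x[, I])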
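The same steps (number the distinct rows, attach that number to every original row, split the row numbers by it) can also be sketched with data.table and base R alone. This is a variant under my own assumptions, not the code posted above: it swaps `merge()` for an update join and the pipe for base `split()`, and the name `row_groups` is hypothetical.

```r
library(data.table)

Data <- data.table(A = c("a", "a", "c"),
                   B = c("A", "A", "B"))

# Number the distinct rows: G is the group id of each unique combination.
Data.unique <- unique(Data)
Data.unique[, G := .I]

# Update join: copy the group id onto every matching row of Data by reference.
Data[Data.unique, on = c("A", "B"), G := i.G]

# Split the row numbers by group id; the list names are the group ids.
row_groups <- split(seq_len(nrow(Data)), Data$G)
print(row_groups)
```

Because the join updates `Data` in place, the original row order is kept and no helper column `I` is needed.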
