Exclude rows where an element has already appeared N times

I have the following input data:

#     [,1] [,2]
#[1,] "A"  "B" 
#[2,] "A"  "C" 
#[3,] "A"  "D" 
#[4,] "B"  "C" 
#[5,] "B"  "D" 
#[6,] "C"  "D" 

Next, I want to exclude rows whose first or second element has already appeared N times in the previous rows. For example, if N = 2, the following rows need to be excluded:

#[3,] "A"  "D" - element "A" has already appeared 2 times
#[5,] "B"  "D" - element "B" has already appeared 2 times
#[6,] "C"  "D" - element "C" has already appeared 2 times

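Note: exclusions must take effect immediately, so only rows that have been kept count toward an element's total. For example, if an element appears in 5 earlier rows but only 1 of them remains after the exclusions, the next row containing that element should be kept (provided its other element is also below the limit), because that brings the element's count to 2.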

Example (N=2):

Input data:

      [,1] [,2]
 [1,] "A"  "B" 
 [2,] "A"  "C" 
 [3,] "A"  "D" 
 [4,] "A"  "E" 
 [5,] "B"  "C" 
 [6,] "B"  "D" 
 [7,] "B"  "E" 
 [8,] "C"  "D" 
 [9,] "C"  "E" 
[10,] "D"  "E"  

Output data:

     [,1] [,2]
 [1,] "A"  "B" 
 [2,] "A"  "C" 
 [5,] "B"  "C" 
[10,] "D"  "E" 

There are possibly more elegant solutions... but this seems to work:

v <- c("A", "B", "C", "D", "E")
cmb <- t(combn(v, 2))

n <- 2

# Go through each letter
for (l in v) {
  # Flag the rows that contain this letter
  rows <- apply(cmb, 1, function(x) l %in% x)

  # Keep at most the first n of those rows
  rows.2 <- head(which(rows), n)

  # Put the kept rows first, then append all rows that do not contain the letter.
  # drop = FALSE keeps a single selected row as a matrix.
  cmb <- rbind(cmb[rows.2, , drop = FALSE], cmb[!rows, , drop = FALSE])
}

cmb

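which outputs the four expected rows, though not in their original order, because each pass moves the kept rows to the top: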

    [,1] [,2]
[1,] "D"  "E" 
[2,] "B"  "C" 
[3,] "A"  "C" 
[4,] "A"  "B" 
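
If the original row order matters, a single pass with running counts may be simpler. The following is only a sketch of that idea, not the approach used above: `keep_first_n` is a hypothetical helper name, and it assumes that only kept rows count toward an element's total, as described in the question's note.

```r
# Hypothetical helper: keep a row only if neither of its elements has
# already appeared n times among the rows kept so far.
keep_first_n <- function(m, n) {
  elems_all <- unique(c(m))
  # Running count of each element, over kept rows only
  counts <- setNames(integer(length(elems_all)), elems_all)
  keep <- logical(nrow(m))

  for (i in seq_len(nrow(m))) {
    elems <- unique(m[i, ])
    if (all(counts[elems] < n)) {
      keep[i] <- TRUE
      counts[elems] <- counts[elems] + 1L
    }
  }

  # drop = FALSE keeps the result a matrix even if one row survives
  m[keep, , drop = FALSE]
}

cmb <- t(combn(c("A", "B", "C", "D", "E"), 2))
res <- keep_first_n(cmb, 2)
res
```

For the five-letter example with n = 2, this keeps rows 1, 2, 5 and 10 of the input, in that order.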
