
R expand.grid but with prohibitions

I have a list of items v1, v2, etc. Important: its length is NOT known in advance!!

Each item (v1, v2, etc.) shows possible levels: 1:2, 1:3 etc. I need to create a data frame with all possible combinations of levels for items v1, v2, etc. I could do it quite effectively using 'expand.grid':

mylist <- list(v1 = 1:3, v2 = 1:3, v3 = 1:3)
combos <- expand.grid(mylist, KEEP.OUT.ATTRS = FALSE)

But here come the complications. Some levels of some items are prohibited from appearing together, e.g., v1 == 1 and v3 == 1 cannot be combined. Below is how I would define such prohibitions (two prohibitions here):

prohibitions = data.frame(item1 = c("v1", "v2"), Level1 = c(1,2),
                          item2 = c("v3", "v3"), Level2 = c(1,3),
                          stringsAsFactors = FALSE)
prohibitions

Of course, I could take my result of expand.grid ('combos') with all possible combinations of item levels, and then remove rows that contain prohibitions:

# seq_len() is used instead of 1:nrow() so the loop is skipped
# when the prohibitions table is empty.
for (row in seq_len(nrow(prohibitions))) {
  item1 <- prohibitions[row, 'item1']
  item1_level <- prohibitions[row, 'Level1']
  item2 <- prohibitions[row, 'item2']
  item2_level <- prohibitions[row, 'Level2']
  # Removing rows that contain prohibited combinations:
  combos <- combos[!(combos[[item1]] == item1_level &
                       combos[[item2]] == item2_level), ]
}
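A possible variation on the loop above, as a rough sketch only: accumulate one logical mask over all prohibitions and subset 'combos' a single time, rather than re-subsetting the data frame on every pass. The variable name 'bad' is just an illustrative choice; the data are the same example values used in the question.

```r
mylist <- list(v1 = 1:3, v2 = 1:3, v3 = 1:3)
combos <- expand.grid(mylist, KEEP.OUT.ATTRS = FALSE)
prohibitions <- data.frame(item1 = c("v1", "v2"), Level1 = c(1, 2),
                           item2 = c("v3", "v3"), Level2 = c(1, 3),
                           stringsAsFactors = FALSE)

# 'bad' marks every row that violates at least one prohibition.
bad <- rep(FALSE, nrow(combos))
for (row in seq_len(nrow(prohibitions))) {
  bad <- bad |
    (combos[[prohibitions$item1[row]]] == prohibitions$Level1[row] &
     combos[[prohibitions$item2[row]]] == prohibitions$Level2[row])
}

# Subset once, keeping the data frame shape.
combos <- combos[!bad, , drop = FALSE]
```

This still materialises the full grid first, so it does not address the memory concern; it only avoids repeated copying of the data frame.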

However, I am not sure this is an efficient way of doing it, mainly because when 'mylist' is long and some items have many levels, 'combos' will become huge. So I thought it might be better to build 'combos' on the fly, WHILE TAKING THE PROHIBITIONS INTO ACCOUNT. But then it seems I would have to write a very long loop through all the items. Problems with that:

  1. I don't know how to write a loop through a bunch of items (v1, v2, etc.) when I don't know in advance how many of them there are.
  2. Loops in R are slow.
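
One way the "on the fly" idea could be sketched: loop over names(mylist), which works for any number of items, cross-join one item at a time, and drop prohibited rows as soon as both items of a prohibition are present. The helper name build_combos() is hypothetical, not a base R function; the sketch relies on merge() with by = NULL returning the Cartesian product, and assumes each prohibition names two different items.

```r
mylist <- list(v1 = 1:3, v2 = 1:3, v3 = 1:3)
prohibitions <- data.frame(item1 = c("v1", "v2"), Level1 = c(1, 2),
                           item2 = c("v3", "v3"), Level2 = c(1, 3),
                           stringsAsFactors = FALSE)

# Hypothetical helper: grows the table one item at a time.
build_combos <- function(items, prohibitions) {
  combos <- NULL
  for (item in names(items)) {
    new_col <- setNames(data.frame(items[[item]]), item)
    # Cross join the partial table with the levels of the new item.
    combos <- if (is.null(combos)) new_col else merge(combos, new_col, by = NULL)
    # Apply each prohibition at the step where its second item arrives.
    for (row in seq_len(nrow(prohibitions))) {
      a <- prohibitions$item1[row]
      b <- prohibitions$item2[row]
      if (item %in% c(a, b) && all(c(a, b) %in% names(combos))) {
        bad <- combos[[a]] == prohibitions$Level1[row] &
               combos[[b]] == prohibitions$Level2[row]
        combos <- combos[!bad, , drop = FALSE]
      }
    }
  }
  rownames(combos) <- NULL
  combos
}

combos <- build_combos(mylist, prohibitions)
```

The outer loop runs once per item, not once per combination, so its cost is small compared with the joins. Rows ruled out early are never expanded further, but the final table can of course still be large, and the row order is not guaranteed to match expand.grid.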

Or maybe there is a way in R to build iterators or stacks like in Python so that I could build my 'combos' as a stack and then evaluate each row one at a time?
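
Base R has no generators in the Python sense, but a row-at-a-time evaluation can be approximated by computing the k-th combination directly from its index, so the full grid is never stored. The sketch below is only one possible approach: combo_at() and is_allowed() are hypothetical helper names, arrayInd() is used to decode the index with the first item varying fastest (the same order as expand.grid), and all levels are assumed to be of one atomic type.

```r
mylist <- list(v1 = 1:3, v2 = 1:3, v3 = 1:3)
prohibitions <- data.frame(item1 = c("v1", "v2"), Level1 = c(1, 2),
                           item2 = c("v3", "v3"), Level2 = c(1, 3),
                           stringsAsFactors = FALSE)

# Hypothetical helper: the k-th combination as a named vector.
combo_at <- function(items, k) {
  idx <- arrayInd(k, .dim = unname(lengths(items)))  # 1 x n index matrix
  mapply(function(levels, i) levels[i], items, idx[1, ])
}

# Hypothetical helper: TRUE if the combination violates no prohibition.
is_allowed <- function(combo, prohibitions) {
  hit <- combo[prohibitions$item1] == prohibitions$Level1 &
         combo[prohibitions$item2] == prohibitions$Level2
  !any(hit)
}

total <- prod(lengths(mylist))
kept <- 0L
for (k in seq_len(total)) {
  combo <- combo_at(mylist, k)
  if (is_allowed(combo, prohibitions)) {
    kept <- kept + 1L  # process 'combo' here instead of storing it
  }
}
```

Memory use stays constant, but this trades it for an R-level loop over every combination, so it is likely to be much slower than the vectorised filter whenever the full grid does fit in memory.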

Any advice? Or is the solution I proposed above the only reasonable one? Thank you very much!

Have you tried the wildcard package? It's like expand.grid(), but a bit more flexible.
