
With RcppAlgos (R), is there a function to obtain the index of the elements in the resulting combinations?

I just discovered RcppAlgos and love it for its efficiency. I'm fairly new to combinatorics, but am wondering how I might go about solving the below example problem.

I have a vector of prices and a vector of items from a grocery store. I want to get all combinations of items that equal a price of 7.

price <- c(1,2,2,3,4,5,6,7)
items <- c("apple","orange","banana","watermelon","coffee","steak","milk","yogurt")

When running the below function to get all combinations that equal 7, I get the returned matrix:

comboGeneral(price, 2, constraintFun = "sum", comparisonFun = "==", limitConstraints = 7)

      [,1] [,2]
[1,]    1    6
[2,]    2    5
[3,]    2    5
[4,]    3    4

I would like to tie these back to their items, so is there a way to get a vector of index values returned that would allow me to merge back to the items vector? Or another function that would efficiently perform this task? Where I'm stumbling is that some items might have the same price, making the merge/join/match more challenging to perform.

For example, instead of returning a matrix of values that meet the constraint, is it possible to return the index values for these? Like the below:

      [,1] [,2]
[1,]    1    7
[2,]    2    6
[3,]    3    6
[4,]    4    5

This way I could then produce a matrix of:

       [,1]          [,2]
[1,]   "apple"       "milk"
[2,]   "orange"      "steak"
[3,]   "banana"      "steak"
[4,]   "watermelon"  "coffee"

Everything I have tried up to this point has led me to utilizing joins with a dataframe of values. There might be an easier approach than what I outlined above, but the data I am working with results in hundreds of thousands of combinations, so I'm working toward efficiency and it's incredible how fast RcppAlgos is. Any help is appreciated.

I am unaware of an approach with RcppAlgos that efficiently gets back to the original index, especially if there are repeats in the price vector.

However, we can try something similar with data.table. It may well be inadequate for your needs, since most of the memory has to be allocated up front.

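library(data.table)
# price and items are the vectors from the question; id records each item's position
dt = data.table(price, items, id = seq_along(price), key = 'id')

# self-join: pair every row with each row that has a strictly larger id
dt[dt,
   on = .(id < id),
   .(total_price = price + i.price,
     fruit1 = items,
     fruit2 = i.items),
   allow.cartesian = TRUE
   ][total_price == 7L, .(fruit1, fruit2)]  # keep only the pairs that sum to 7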

The idea is to filter as early as possible to limit memory allocation. The non-equi join on id < id keeps each unordered pair only once and drops self-pairs, so it materializes roughly half of the full cross join before the price filter is applied.
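The same index-first idea can also be sketched in base R, without data.table: enumerate pairs of positions rather than pairs of prices, then filter on the summed price. This is only a minimal sketch, not an RcppAlgos feature. It uses combn() from base R, the names idx, idx_hit and item_hit are made up for illustration, and every pair is materialized before filtering, so the memory caveat above still applies.

```r
price <- c(1, 2, 2, 3, 4, 5, 6, 7)
items <- c("apple", "orange", "banana", "watermelon",
           "coffee", "steak", "milk", "yogurt")

# All pairs of positions (not prices), one pair per row
idx <- t(combn(seq_along(price), 2))

# Keep the position pairs whose prices sum to 7
idx_hit <- idx[price[idx[, 1]] + price[idx[, 2]] == 7, , drop = FALSE]

# items is a plain vector, so the index matrix is read column by column;
# matrix() puts the looked-up names back into the same two-column shape
item_hit <- matrix(items[idx_hit], ncol = 2)
```

Because the rows of idx_hit are positions, duplicate prices are not a problem: "orange" and "banana" both cost 2, and each is paired with "steak" in its own row.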
