Replicate Excel solver in R

I am wondering whether there is a simple function to solve the following problem in R.

Suppose I have the following data frame:

  • Variable 'A' with values c(10, 35, 90)
  • Variable 'B' with values c(3, 4, 17, 18, 50, 40, 3)

I know that sums of various values in B equal the values in A, e.g. '3 + 4 + 3 = 10' and '17 + 18 = 35', and that this always balances out in the complete dataset.
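To make the setup concrete, here is a minimal sketch of the example data as plain vectors (A and B differ in length, so they are not stored as columns of one data frame here). The third grouping, 50 + 40 = 90, is not spelled out in the question; it is inferred from the values that remain.

```r
# Values from the question, kept as plain vectors
A <- c(10, 35, 90)
B <- c(3, 4, 17, 18, 50, 40, 3)

# The totals balance: every value in B belongs to exactly one value in A
totals_balance <- sum(A) == sum(B)

# Groupings by position in B (the last one is inferred, not stated in the question)
groups <- list(c(1, 2, 7), c(3, 4), c(5, 6))
group_sums <- sapply(groups, function(g) sum(B[g]))

totals_balance
group_sums
```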

Question

Is there a function that can sum values in B, through trial and error I suppose, and match the correctly summed values with A? For example, the function tries 3 + 4 + 18, which is 25, and then retries because 25 is not a value in A.

I have tried several solutions myself, but one problem I often ran into is that A always has fewer observations than B.

I would be very thankful if someone could help me out here! If more info is needed, please let me know.

Cheers,

Daan

Edit

This example uses simplified numbers. In reality, it is a large dataset, so I am looking for a scalable solution.

Thanks again!

This problem is known as the subset sum problem, and there are plenty of examples online of how to solve it using dynamic programming or greedy algorithms.

To give you an answer that just works, the package adagio has an implementation:

library(adagio)

sums <- c(10, 35, 90)
values <- c(3, 4, 17, 18, 50, 40, 3)

for (i in sums) {
  # Keep only the values smaller than the target;
  # otherwise the function errors.
  candidates <- values[values < i]
  # Note: the returned indices refer to `candidates`, not to `values`.
  print(subsetsum(candidates, i))
}

The output for each sum is a list containing val (the sum that was reached) and the indices of the selected values in the array, so you can tidy up the output depending on what you need from there.
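If you would rather not depend on a package, the dynamic programming idea can be sketched in base R. This is only a sketch under assumptions: the values are positive integers, `subset_sum_dp` is a hypothetical helper name (not part of adagio), and the targets are handled one at a time, removing the values already used. That last step is greedy, so on other data an early choice could block a later target.

```r
# Hypothetical helper: positions in `values` that sum exactly to `target`.
# Assumes positive integers; returns NULL when no subset exists.
subset_sum_dp <- function(values, target) {
  # from[s + 1] = index of the value that first made sum s reachable (NA = unreachable)
  from <- rep(NA_integer_, target + 1)
  from[1] <- 0L                       # sum 0 is reachable with the empty subset
  for (i in seq_along(values)) {
    v <- values[i]
    if (v > target) next
    # Go downwards so that each value is used at most once
    for (s in seq(target, v)) {
      if (is.na(from[s + 1]) && !is.na(from[s - v + 1])) from[s + 1] <- i
    }
  }
  if (is.na(from[target + 1])) return(NULL)
  # Walk back from the target to 0 to recover the chosen positions
  picked <- integer(0)
  s <- target
  while (s > 0) {
    i <- from[s + 1]
    picked <- c(i, picked)
    s <- s - values[i]
  }
  picked
}

A <- c(10, 35, 90)
B <- c(3, 4, 17, 18, 50, 40, 3)

available <- seq_along(B)             # positions in B not yet assigned
matches <- vector("list", length(A))
names(matches) <- A
for (k in seq_along(A)) {
  hit <- subset_sum_dp(B[available], A[k])
  if (is.null(hit)) next              # no subset found; entry stays NULL
  matches[[k]] <- available[hit]      # map back to positions in the original B
  available <- setdiff(available, matches[[k]])
}
matches
```

Each element of `matches` holds positions in the original B, so `B[matches[["10"]]]` gives the values assigned to 10. Memory and run time grow with the size of the target, so very large totals would need a different approach.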

You can try the following, but I am afraid it is not scalable. For the case of 3 summands you have:

B <- c(3, 4, 17, 18, 50, 40, 3)
x <- expand.grid(B, B, B)            # every ordered combination of 3 summands
x$sums <- rowSums(x)                 # new column with the possible sums
idx <- x$sums %in% c(10, 35, 90)     # check which sums are among the required totals
# Note: rows can reuse the same element of B and repeat the same
# combination in a different order.
x[idx, ]

    Var1 Var2 Var3 sums
2      4    3    3   10
8      3    4    3   10
14     3    4    3   10
44     4    3    3   10
50     3    3    4   10
56     3    3    4   10
92     3    3    4   10
98     3    3    4   10
296    4    3    3   10
302    3    4    3   10
308    3    4    3   10
338    4    3    3   10

For the case of 2 summands:

B <- c(3, 4, 17, 18, 50, 40, 3)
x <- expand.grid(B, B)               # every ordered pair of summands
x$sums <- rowSums(x)
idx <- x$sums %in% c(10, 35, 90)
# Results
x[idx, ]
   Var1 Var2 sums
18   18   17   35
24   17   18   35
34   40   50   90
40   50   40   90
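A variant of the same brute-force idea uses combn() on the positions of B instead of expand.grid() on the values. That way each element is used at most once per combination and permutations are not repeated. This is only a sketch: `matching_subsets` is a hypothetical helper name, and trying every subset size is still exponential in length(B), so it does not fix the scalability problem.

```r
A <- c(10, 35, 90)
B <- c(3, 4, 17, 18, 50, 40, 3)

# Hypothetical helper: all sets of k positions in B whose values sum to a total in A
matching_subsets <- function(B, A, k) {
  idx <- combn(seq_along(B), k)               # each column is one combination of positions
  sums <- colSums(matrix(B[idx], nrow = k))   # sum of the values in each combination
  keep <- which(sums %in% A)
  lapply(keep, function(j) list(target = sums[j], positions = idx[, j]))
}

# Try every subset size from 1 to length(B) and flatten the results into one list
hits <- unlist(
  lapply(seq_along(B), function(k) matching_subsets(B, A, k)),
  recursive = FALSE
)

# One line per match: the total and the positions in B that produce it
for (h in hits) {
  cat(h$target, "=", paste(B[h$positions], collapse = " + "), "\n")
}
```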
