
Optimization Algorithm in C#

I have an optimization problem and I'm not sure where to go from here. I have a program that tries to find the combination of inputs that returns the highest predicted R-squared value. The problem is that I have 21 inputs in total (a List) and I need to choose a set of 15 of them. The formula for the total number of combinations is:

n! / (r!(n - r)!) = 21! / (15!(21 - 15)!) = 54,264 possible combinations
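As a quick check of that count, here is a minimal sketch that evaluates the formula without computing 21! directly. The class and method names (`Combinatorics`, `Binomial`) are made up for illustration and are not part of the program in the question.

```csharp
using System;

public static class Combinatorics
{
    // Computes C(n, r) = n! / (r!(n - r)!) with the multiplicative formula,
    // which avoids evaluating large factorials such as 21! directly.
    public static long Binomial(int n, int r)
    {
        if (r < 0 || r > n) return 0;
        r = Math.Min(r, n - r);   // C(n, r) == C(n, n - r)
        long result = 1;
        for (int i = 1; i <= r; i++)
        {
            // Before this step result == C(n - r + i - 1, i - 1);
            // the product below is always exactly divisible by i.
            result = result * (n - r + i) / i;
        }
        return result;
    }
}
```

Calling `Combinatorics.Binomial(21, 15)` returns 54264, which matches the figure above.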

Running through every combination and calculating the predicted R-squared for each is obviously not ideal. Is there a better algorithm or method I can use to skip or narrow down the bad combinations, so that I process as few combinations as possible? Here is my current pseudocode for this issue:

public BestCombo GetBestCombo(List<List<MultipleRegressionInfo>> combosList)
{
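   // Predicted R-squared can be negative, so start below any possible value.
   BestCombo bestCombo = new BestCombo { predRSquared = double.NegativeInfinity };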

   foreach (var combo in combosList)
   {
      var predRsquared = CalculatePredictedRSquared(combo);

      if (predRsquared > bestCombo.predRSquared)
      {
         bestCombo.predRSquared = predRsquared;
         bestCombo.BestRSquaredCombo = combo;
      }
   }

   return bestCombo;
}

public class BestCombo
{
    public double predRSquared { get; set; }
    public IEnumerable<MultipleRegressionInfo> BestRSquaredCombo { get; set; }
}

public class MultipleRegressionInfo
{
    public List<double> input { get; set; }
    public List<double> output { get; set; }
}

public double CalculatePredictedRSquared(List<MultipleRegressionInfo> combo)
{
    // One input column per element of the combo. Assumption: every element
    // carries the same output series, so the first one is used.
    var input = combo.Select(i => i.input).ToArray();
    var output = combo.ElementAt(0).output;

    Matrix<double> matrix = BuildMatrix(input);
    Vector<double> vector = BuildVector(output);
    var coefficients = CalculateWithQR(matrix, vector);
    var y = CalculateYIntercept(coefficients, input, output);
    var estimateList = CalculateEstimates(coefficients, y, input, output);
    return GetPredRsquared(estimateList, output);
}

54,264 is not an enormous number for a computer. It might be worth timing a few calls that compute R^2 and multiplying up to see just how long the full search would take.
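A minimal sketch of that timing check, using `System.Diagnostics.Stopwatch`. The names `TimingEstimate` and `EstimateTotal` are hypothetical, and `evaluate` stands in for one call to `CalculatePredictedRSquared` on a representative combination.

```csharp
using System;
using System.Diagnostics;

public static class TimingEstimate
{
    // Times `sampleCalls` invocations of `evaluate` and scales the average
    // cost per call up to `totalCombos` evaluations.
    public static TimeSpan EstimateTotal(Action evaluate, int sampleCalls, long totalCombos)
    {
        if (sampleCalls <= 0) throw new ArgumentOutOfRangeException(nameof(sampleCalls));

        evaluate();   // warm-up call so that JIT compilation is not timed

        var stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < sampleCalls; i++)
        {
            evaluate();
        }
        stopwatch.Stop();

        double secondsPerCall = stopwatch.Elapsed.TotalSeconds / sampleCalls;
        return TimeSpan.FromSeconds(secondsPerCall * totalCombos);
    }
}
```

With the code from the question this could be called as `TimingEstimate.EstimateTotal(() => CalculatePredictedRSquared(combosList[0]), 20, 54264)`. If the estimate comes out at a few seconds or minutes, the brute-force loop may already be good enough.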

There is a branch and bound algorithm for this sort of problem. It relies on the fact that R^2(A,B,C) >= R^2(A,B): R^2 can only decrease (or stay the same) when you drop a variable. Recursively search the space of all sets of variables of size at least 15. After computing the R^2 for a set of variables, make recursive calls with the sets produced by dropping a single variable from it, where any such drop must be to the right of any existing gap (so A.CDE produces A..DE, A.C.E, and A.CD., but not ..CDE, which will be produced by .BCDE). You can terminate the recursion when you get down to the desired set size, or when you find an R^2 that is no better than the best answer so far (the best R^2 found for a set of the desired size), since no smaller subset of that set can then beat it.
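The recursion described above could be sketched roughly as follows. This is an illustration under assumptions, not a drop-in implementation: sets are encoded as bitmasks (bit i set means variable i is kept), `SubsetSearch` and its members are made-up names, and `rSquared` is a hypothetical scoring function supplied by the caller. The pruning step is only valid if the score cannot rise when a variable is dropped, as stated for R^2; whether the predicted R^2 from the question has that property is not established here.

```csharp
using System;

public sealed class SubsetSearch
{
    private readonly int _n;                       // total number of variables (21 in the question)
    private readonly int _targetSize;              // desired set size (15 in the question)
    private readonly Func<int, double> _rSquared;  // hypothetical scorer: bitmask of kept variables -> R^2

    public double BestScore { get; private set; } = double.NegativeInfinity;
    public int BestMask { get; private set; }
    public int Evaluations { get; private set; }

    public SubsetSearch(int n, int targetSize, Func<int, double> rSquared)
    {
        if (n < 1 || n > 30) throw new ArgumentOutOfRangeException(nameof(n));
        if (targetSize < 1 || targetSize > n) throw new ArgumentOutOfRangeException(nameof(targetSize));
        _n = n;
        _targetSize = targetSize;
        _rSquared = rSquared;
    }

    // Start from the full set; any variable may be dropped first.
    public void Run() => Visit((1 << _n) - 1, _n, 0);

    // mask: variables kept; size: how many are kept;
    // firstDroppable: lowest index that may still be dropped (right of the last gap).
    private void Visit(int mask, int size, int firstDroppable)
    {
        double score = _rSquared(mask);
        Evaluations++;

        if (size == _targetSize)
        {
            if (score > BestScore) { BestScore = score; BestMask = mask; }
            return;
        }

        // Bound: dropping variables cannot raise the score, so nothing
        // below this set can beat the best answer found so far.
        if (score <= BestScore) return;

        int dropsNeeded = size - _targetSize;
        // Only drop index i if enough variables remain to its right
        // to complete the remaining drops.
        for (int i = firstDroppable; i <= _n - dropsNeeded; i++)
        {
            Visit(mask & ~(1 << i), size - 1, i + 1);
        }
    }
}
```

For the question's numbers the search would be created as `new SubsetSearch(21, 15, mask => ...)`, with the lambda building the regression for the variables whose bits are set. `Evaluations` shows how many sets were actually scored, which makes it easy to compare against the 54,264 of the brute-force loop.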

If you often find R^2 values no better than the best answer so far, this will save time, but that is not guaranteed. You can try to improve the efficiency in two ways: by choosing to investigate the sets with the highest R^2 first, hoping to find a new best answer good enough to rule out their siblings by the time you come to them; and by calculating R^2 for A.CDE with a procedure that reuses the calculations you have already done for ABCDE.
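The first of these ideas, visiting the most promising sets first, might look something like the helper below. It is only a sketch: `SiblingOrdering` and `BestFirst` are made-up names, sets are assumed to be encoded as bitmasks of kept variables, and `rSquared` is a hypothetical scoring function. The scores are returned together with the sets so that each set is evaluated only once.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SiblingOrdering
{
    // Scores every sibling set once and returns them with the highest
    // score first, so a recursive search can explore the best ones early.
    public static List<(int Mask, double Score)> BestFirst(
        IEnumerable<int> siblingMasks, Func<int, double> rSquared)
    {
        return siblingMasks
            .Select(mask => (Mask: mask, Score: rSquared(mask)))
            .OrderByDescending(pair => pair.Score)
            .ToList();
    }
}
```

A search would then recurse into the returned sets in order and stop as soon as it reaches one whose score is no better than the best answer so far, because every set after it in the list scores no higher.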
