Does this algorithm actually work? Sum of subsets backtracking algorithm

I want to know if this backtracking algorithm actually works.

In the textbook Foundations of Algorithms, 5th edition, it is defined as follows:

Algorithm 5.4: The Backtracking Algorithm for the Sum-of-Subsets Problem

Problem: Given n positive integers (weights) and a positive integer W, determine all combinations of the integers that sum to W.

Inputs: positive integer n, sorted (nondecreasing order) array of positive integers w indexed from 1 to n, and a positive integer W.

Outputs: all combinations of the integers that sum to W.

 void sum_of_subsets(index i, int weight, int total)
 {
     if (promising(i))
         if (weight == W)
             cout << include[1] through include[i];
         else {
             include[i + 1] = "yes";    // Include w[i + 1].
             sum_of_subsets(i + 1, weight + w[i + 1], total - w[i + 1]);
             include[i + 1] = "no";     // Do not include w[i + 1].
             sum_of_subsets(i + 1, weight, total - w[i + 1]);
         }
 }

 bool promising(index i)
 {
     return (weight + total >= W) && (weight == W || weight + w[i + 1] <= W);
 }

Following our usual convention, n, w, W, and include are not inputs to our routines. If these variables were defined globally, the top-level call to sum_of_subsets would be as follows:

 sum_of_subsets(0, 0, total);

At the end of chapter 5, exercise 13 asks:

  1. Use the Backtracking algorithm for the Sum-of-Subsets problem (Algorithm 5.4) to find all combinations of the following numbers that sum to W = 52:

    w1 = 2, w2 = 10, w3 = 13, w4 = 17, w5 = 22, w6 = 42

I've implemented this exact algorithm, accounting for arrays that start at 1, and it just does not work...

 void sos(int i, int weight, int total) {
    int yes = 1;
    int no = 0;

    if (promising(i, weight, total)) {
        if (weight == W) {
            for (int j = 0; j < arraySize; j++) {
                std::cout << include[j] << " ";
            }
            std::cout << "\n";
        }
        else if(i < arraySize) {
            include[i+1] = yes;
            sos(i + 1, weight + w[i+1], total - w[i+1]);
            include[i+1] = no;
            sos(i + 1, weight, total - w[i+1]);
        }
    }
}


int promising(int i,  int weight, int total) {
    return (weight + total >= W) && (weight == W || weight + w[i+1] <= W);
}   

I believe the problem is here:

sos(i + 1, weight, total - w[i+1]);
sum_of_subsets(i+1, weight, total-w[i+1]);

When you reach this line you are not backtracking correctly.

Is anyone able to identify a problem with this algorithm or actually code it to work?

I personally find the algorithm problematic. There is no bounds checking, it uses a lot of globals, and it assumes an array is indexed from 1. I don't think you can copy it verbatim. It's pseudocode for the actual implementation. In C++, arrays always start from 0, so you're likely to have problems when you try to do include[i+1] and you are only checking i < arraySize.

The algorithm also assumes you have a global variable called total, which is used by the function promising.

I have reworked the code a bit, putting it inside a class, and simplified it somewhat:

#include <iostream>
#include <utility>
#include <vector>

class Solution
{
private:
    std::vector<int> w;
    std::vector<int> include;

public:
    Solution(std::vector<int> weights) : w(std::move(weights)),
        include(w.size(), 0) {}

    void sos(int i, int weight, int total) {
        int yes = 1;
        int no = 0;
        int arraySize = include.size();

        if (weight == total) {
            for (int j = 0; j < arraySize; j++) {
                if (include[j]) {
                    std::cout << w[j] << " ";
                }
            }
            std::cout << "\n";
        }
        else if (i < arraySize)
        {
            include[i] = yes;
            //Include this weight
            sos(i + 1, weight + w[i], total);
            include[i] = no;
            //Exclude this weight
            sos(i + 1, weight, total);
        }
    }
};

int main()
{   
    Solution solution({ 2, 10, 13, 17, 22, 42 });
    solution.sos(0, 0, 52);
    //prints:    10 42
    //           13 17 22
}

So yes, as others pointed out, you stumbled over the 1-based array index.

That aside, I think you should ask the author for a partial return of the money you paid for the book, because the logic of his code is overly complicated.

One good way not to run into bounds problems is to not use C++ (expecting a hail of downvotes for this, lol).

There are only 3 cases to test for:

  • The candidate value is greater than what is remaining. (busted)
  • The candidate value is exactly what is remaining.
  • The candidate value is less than what is remaining.

The promising function tries to express that, and then the result of that function is tested again in the main function sos.

But it could look as simple as this:

search :: [Int] -> Int -> [Int] -> [[Int]]
search (x1:xs) t path 
    | x1 > t = []
    | x1 == t = [x1 : path]
    | x1 < t = search xs (t-x1) (x1 : path) ++ search xs t path
search [] 0 path = [path]
search [] _ _ = []


items = [2, 10, 13, 17, 22, 42] :: [Int]
target = 52 :: Int

search items target []
-- [[42,10],[22,17,13]]

Now, it is by no means impossible to achieve a similar safety net while writing C++ code. But it takes determination and a conscious decision on what you are willing to cope with and what not. And you need to be willing to type a few more lines to accomplish what the 10 lines of Haskell do.

First off, I was bothered by all the complexity of indexing and range checking in the original C++ code. If we look at our Haskell code (which works with lists), it confirms that we do not need random access at all. We only ever look at the start of the remaining items. We append a value to the path (in Haskell we prepend, because that is faster), and eventually we append a found combination to the result set. With that in mind, bothering with indices is kind of over the top.

Secondly, I rather like the way the search function looks - showing the 3 crucial tests without any noise surrounding them. My C++ version should strive to be as pretty.

Also, global variables are so 1980 - we won't have that. And tucking those "globals" into a class to hide them a bit is so 1995. We won't have that either.

And here it is: the "safer" C++ implementation. And prettier... um... well, some of you might disagree ;)

#include <cstdint>
#include <vector>
#include <iostream>

using Items_t = std::vector<int32_t>;
using Result_t = std::vector<Items_t>;

// The C++ way of saying: deriving(Show)
template <class T>
std::ostream& operator <<(std::ostream& os, const std::vector<T>& value) 
{
    bool first = true;
    os << "[";
    for( const auto item : value) 
    {
        if(first) 
        {
            os << item;
            first = false;
        }
        else
        {
            os << "," << item;
        }
    }
    os << "]";
    return os;
}

// So we can do easy switch statement instead of chain of ifs.
enum class Comp : int8_t 
{   LT = -1
,   EQ = 0
,   GT = 1
};

static inline 
auto compI32( int32_t left, int32_t right ) -> Comp
{
    if(left == right) return Comp::EQ;
    if(left < right) return Comp::LT;
    return Comp::GT;
}

// So we can avoid index insanity and out of bounds problems.
template <class T>
struct VecRange
{
    using Iter_t = typename std::vector<T>::const_iterator;
    Iter_t current;
    Iter_t end;
    VecRange(const std::vector<T>& v)
        : current{v.cbegin()}
        , end{v.cend()}
    {}
    VecRange(Iter_t cur, Iter_t fin)
        : current{cur}
        , end{fin}
    {}
    static bool exhausted (const VecRange<T>&);
    static VecRange<T> next(const VecRange<T>&);
};

template <class T>
bool VecRange<T>::exhausted(const VecRange<T>& range)
{
    return range.current == range.end;
}

template <class T>
VecRange<T> VecRange<T>::next(const VecRange<T>& range)
{
    if(range.current != range.end)
        return VecRange<T>( range.current + 1, range.end );
    return range;   
}

using ItemsRange = VecRange<Items_t::value_type>;


static void search( const ItemsRange items, int32_t target, Items_t path, Result_t& result)
{
    if(ItemsRange::exhausted(items))
    {
        if(0 == target)
        {
            result.push_back(path);
        }
        return;
    }

    switch(compI32(*items.current,target))
    {
        case Comp::GT: 
            return;
        case Comp::EQ:
            {
                path.push_back(*items.current);
                result.push_back(path);
            }
            return;
        case Comp::LT:
            {
                auto path1 = path; // copy-initialization: path1 is an independent copy of path
                path1.push_back(*items.current);
                search(ItemsRange::next(items), target - *items.current, path1, result);
                search(ItemsRange::next(items), target, path, result);
            }
            return;
    }
}

int main(int argc, const char* argv[])
{
    Items_t items{ 2, 10, 13, 17, 22, 42 };
    Result_t result;
    int32_t target = 52;

    std::cout << "Input: "  << items << std::endl;
    std::cout << "Target: " << target << std::endl;
    search(ItemsRange{items}, target, Items_t{}, result);
    std::cout << "Output: " << result << std::endl;
    return 0;
}

The code implements the algorithm correctly, except that you did not apply the one-based array logic in your output loop. Change:

for (int j = 0; j < arraySize; j++) {
    std::cout << include[j] << " ";
}

to:

for (int j = 0; j < arraySize; j++) {
    std::cout << include[j+1] << " ";
}

Depending on how you organised your code, make sure that promising is declared before sos is defined.

See it run on repl.it. Output:

0 1 0 0 0 1
0 0 1 1 1 0

The algorithm works fine: the second and third arguments to the sos function act as a window in which the running sum should stay, and the promising function verifies against this window. Any value outside this window will be either too small (even if all remaining values were added to it, it would still be less than the target value) or too great (already overrunning the target). These two constraints are explained at the beginning of section 5.4 in the book.

At each index there are two possible choices: either include the value in the sum, or don't. The value at include[i+1] represents this choice, and both are attempted. When a match is found deep in such a recursive attempt, all these choices (0 or 1) are output. Otherwise they are just ignored and switched to the opposite choice in a second attempt.
