简体   繁体   中英

Combinations of N Boost interval_set

I have a service which has outages in 4 different locations. I am modeling each location outages into a Boost ICL interval_set. I want to know when at least N locations have an active outage.

Therefore, following this answer , I have implemented a combination algorithm, so I can create combinations between elemenets via interval_set intersections.

Whehn this process is over, I should have a certain number of interval_set, each one of them defining the outages for N locations simultaneusly, and the final step will be joining them to get the desired full picture.

The problem is that I'm currently debugging the code, and when the time of printing each intersection arrives, the output text gets crazy (even when I'm using gdb to debug step by step), and I can't see them, resulting in a lot of CPU usage.

I guess that somehow I'm sending to output a larger portion of memory than I should, but I can't see where the problem is.

This is a SSCCE:

#include <boost/icl/interval_set.hpp>
#include <algorithm>
#include <iostream>
#include <vector>


int main() {
    // Initializing data for test
    std::vector<boost::icl::interval_set<unsigned int> > outagesPerLocation;
    for(unsigned int j=0; j<4; j++){
        boost::icl::interval_set<unsigned int> outages;
        for(unsigned int i=0; i<5; i++){
            outages += boost::icl::discrete_interval<unsigned int>::closed(
                (i*10), ((i*10) + 5 - j));
        }
        std::cout << "[Location " << (j+1) << "] " << outages << std::endl;
        outagesPerLocation.push_back(outages);
    }

    // So now we have a vector of interval_sets, one per location. We will combine
    // them so we get an interval_set defined for those periods where at least
    // 2 locations have an outage (N)
    unsigned int simultaneusOutagesRequired = 2;  // (N)

    // Create a bool vector in order to filter permutations, and only get
    // the sorted permutations (which equals the combinations)
    std::vector<bool> auxVector(outagesPerLocation.size());
    std::fill(auxVector.begin() + simultaneusOutagesRequired, auxVector.end(), true);

    // Create a vector where combinations will be stored
    std::vector<boost::icl::interval_set<unsigned int> > combinations;

    // Get all the combinations of N elements
    unsigned int numCombinations = 0;
    do{
        bool firstElementSet = false;
        for(unsigned int i=0; i<auxVector.size(); i++){
            if(!auxVector[i]){
                if(!firstElementSet){
                    // First location, insert to combinations vector
                    combinations.push_back(outagesPerLocation[i]);
                    firstElementSet = true;
                }
                else{
                    // Intersect with the other locations
                    combinations[numCombinations] -= outagesPerLocation[i];
                }
            }
        }
        numCombinations++;
        std::cout << "[-INTERSEC-] " << combinations[numCombinations] << std::endl;  // The problem appears here
    }
    while(std::next_permutation(auxVector.begin(), auxVector.end()));

    // Get the union of the intersections and see the results
    boost::icl::interval_set<unsigned int> finalOutages;
    for(std::vector<boost::icl::interval_set<unsigned int> >::iterator
        it = combinations.begin(); it != combinations.end(); it++){
        finalOutages += *it;
    }

    std::cout << finalOutages << std::endl;
    return 0;
}

Any help?

As I surmised , there's a "highlevel" approach here.

Boost ICL containers are more than just containers of "glorified pairs of interval starting/end points". They are designed to implement just that business of combining, searching, in a generically optimized fashion.

So you don't have to.

If you let the library do what it's supposed to do:

using TimePoint = unsigned;
using DownTimes = boost::icl::interval_set<TimePoint>;
using Interval  = DownTimes::interval_type;
using Records   = std::vector<DownTimes>;

Using functional domain typedefs invites a higher level approach. Now, let's ask the hypothetical "business question":

What do we actually want to do with our records of per-location downtimes?

Well, we essentially want to

  1. tally them for all discernable time slots and
  2. filter those where tallies are at least 2
  3. finally, we'd like to show the "merged" time slots that remain.

Ok, engineer: implement it!


  1. Hmm. Tallying. How hard could it be?

    ❕ The key to elegant solutions is the choice of the right datastructure

     using Tally = unsigned; // or: bit mask representing affected locations? using DownMap = boost::icl::interval_map<TimePoint, Tally>; 

    Now it's just bulk insertion:

     // We will do a tally of affected locations per time slot DownMap tallied; for (auto& location : records) for (auto& incident : location) tallied.add({incident, 1u}); 
  2. Ok, let's filter. We just need the predicate that works on our DownMap, right

     // define threshold where at least 2 locations have an outage auto exceeds_threshold = [](DownMap::value_type const& slot) { return slot.second >= 2; }; 
  3. Merge the time slots!

    Actually. We just create another DownTimes set, right. Just, not per location this time.

    The choice of data structure wins the day again:

     // just printing the union of any criticals: DownTimes merged; for (auto&& slot : tallied | filtered(exceeds_threshold) | map_keys) merged.insert(slot); 

Report!

std::cout << "Criticals: " << merged << "\n";

Note that nowhere did we come close to manipulating array indices, overlapping or non-overlapping intervals, closed or open boundaries. Or, [eeeeek!] brute force permutations of collection elements.

We just stated our goals, and let the library do the work.

Full Demo

Live On Coliru

#include <boost/icl/interval_set.hpp>
#include <boost/icl/interval_map.hpp>
#include <boost/range.hpp>
#include <boost/range/algorithm.hpp>
#include <boost/range/adaptors.hpp>
#include <boost/range/numeric.hpp>
#include <boost/range/irange.hpp>
#include <algorithm>
#include <iostream>
#include <vector>

using TimePoint = unsigned;
using DownTimes = boost::icl::interval_set<TimePoint>;
using Interval  = DownTimes::interval_type;
using Records   = std::vector<DownTimes>;

using Tally     = unsigned; // or: bit mask representing affected locations?
using DownMap   = boost::icl::interval_map<TimePoint, Tally>;

// Just for fun, removed the explicit loops from the generation too. Obviously,
// this is bit gratuitous :)
static DownTimes generate_downtime(int j) {
    return boost::accumulate(
            boost::irange(0, 5),
            DownTimes{},
            [j](DownTimes accum, int i) { return accum + Interval::closed((i*10), ((i*10) + 5 - j)); }
        );
}

int main() {
    // Initializing data for test
    using namespace boost::adaptors;
    auto const records = boost::copy_range<Records>(boost::irange(0,4) | transformed(generate_downtime));

    for (auto location : records | indexed()) {
        std::cout << "Location " << (location.index()+1) << " " << location.value() << std::endl;
    }

    // We will do a tally of affected locations per time slot
    DownMap tallied;
    for (auto& location : records)
        for (auto& incident : location)
            tallied.add({incident, 1u});

    // We will combine them so we get an interval_set defined for those periods
    // where at least 2 locations have an outage
    auto exceeds_threshold = [](DownMap::value_type const& slot) {
        return slot.second >= 2;
    };

    // just printing the union of any criticals:
    DownTimes merged;
    for (auto&& slot : tallied | filtered(exceeds_threshold) | map_keys)
        merged.insert(slot);

    std::cout << "Criticals: " << merged << "\n";
}

Which prints

Location 1 {[0,5][10,15][20,25][30,35][40,45]}
Location 2 {[0,4][10,14][20,24][30,34][40,44]}
Location 3 {[0,3][10,13][20,23][30,33][40,43]}
Location 4 {[0,2][10,12][20,22][30,32][40,42]}
Criticals: {[0,4][10,14][20,24][30,34][40,44]}

At the end of the permutation loop, you write:

numCombinations++;
std::cout << "[-INTERSEC-] " << combinations[numCombinations] << std::endl;  // The problem appears here

My debugger tells me that on the first iteration numCombinations was 0 before the increment. But incrementing it made it out of range for the combinations container (since that is only a single element, so having index 0).

Did you mean to increment it after the use? Was there any particular reason not to use

std::cout << "[-INTERSEC-] " << combinations.back() << "\n";

or, for c++03

std::cout << "[-INTERSEC-] " << combinations[combinations.size()-1] << "\n";

or even just:

std::cout << "[-INTERSEC-] " << combinations.at(numCombinations) << "\n";

which would have thrown std::out_of_range ?


On a side note, I think Boost ICL has vastly more efficient ways to get the answer you're after. Let me think about this for a moment. Will post another answer if I see it.

UPDATE : Posted the other answer show casing highlevel coding with Boost ICL

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM