Can a generic approach be used to reduce average effort performed by algorithms?

Question

I have encountered (and written) code where the natural use of the Standard Template Library in Boolean relational expressions results in (potentially) wasted effort.

For example,

if (std::distance(begin, end) <= 2) { ... }

Or:

if (std::count(begin,end,val) >= 3) { ... }

In both these cases, it is possible to write a custom algorithm to avoid unnecessary iteration/evaluation when the answer is known after a partial evaluation of the range.

Is there a generic approach that can be used to prevent wasted effort in these situations?

EDIT: Trying to address the "close" votes.

For example, I could implement bool distance_at_least(begin, end, 3), bool distance_at_most(begin, end, 2), bool count_at_least(begin, end, val, 5), and so on.

I am asking for a single (generic) approach that can be used for all these types of queries.

EDIT: Here is a mock-up of a solution for one variant, which tries to convey why I am not keen to write many variants.

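#include <algorithm>  // std::generate_n
#include <cstdlib>    // std::rand
#include <iostream>   // std::cout
#include <iterator>   // std::iterator_traits, std::distance, std::back_inserter
#include <list>
#include <vector>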

namespace DETAIL {

    template <class ITER, class CAT>
    bool distance_at_least_dispatch(ITER begin, ITER end, typename std::iterator_traits<ITER>::difference_type n, CAT)
    {
        while (begin != end && n > 0) {
            ++begin;
            --n;
        }
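        // n <= 0 also covers a non-positive n passed in by the caller,
        // matching the random access overload below.
        return n <= 0;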
    }

    template <class ITER>
    bool distance_at_least_dispatch(ITER begin, ITER end, typename std::iterator_traits<ITER>::difference_type n, std::random_access_iterator_tag)
    {
        return std::distance(begin, end) >= n;
    }
}

template <class ITER>
bool distance_at_least(ITER begin, ITER end, typename std::iterator_traits<ITER>::difference_type n)
{
    using CAT = typename std::iterator_traits<ITER>::iterator_category;
    return DETAIL::distance_at_least_dispatch(begin, end, n, CAT());
}

int main()
{
    std::vector<int> v;
    std::list<int> l;
    std::generate_n(std::back_inserter(v), 5, std::rand);
    std::generate_n(std::back_inserter(l), 5, std::rand);

    std::cout << distance_at_least(v.begin(), v.end(), 3) << std::endl;
    std::cout << distance_at_least(v.begin(), v.end(), 5) << std::endl;
    std::cout << distance_at_least(v.begin(), v.end(), 6) << std::endl;
    std::cout << distance_at_least(l.begin(), l.end(), 3) << std::endl;
    std::cout << distance_at_least(l.begin(), l.end(), 5) << std::endl;
    std::cout << distance_at_least(l.begin(), l.end(), 6) << std::endl;

    return 0;
}

Answer 1

I think the main question here should be whether std::count(b,e,v) < 2 is really that bad on decent compilers. std::count is a function template, so its full implementation is available at the point of call, and it is a fairly simple function. This makes it a prime candidate for inlining. After inlining, the optimizer can see that the returned count is compared with 2 and that the result feeds a branch. It can also see that the count only ever increases by one at a time, and that there are no side effects.

Hence, it's possible to reshuffle the code, move the branch, and eliminate the redundant part of the loop.

In pseudocode, unoptimized (assuming a non-empty range):

count = 0
:loop
if (*b==v) ++count
++b
if(b!=e) goto loop
if(count >=2) goto past_if
// Then-block
:past_if

And now optimized:

count = 0
:loop
if (*b==v) ++count
if(count>=2) goto past_if
++b
if(b!=e) goto loop
// Then-block
:past_if

As you can see, this is a simple reordering: only one line moves. This isn't rocket science; a decent optimizer should be able to figure this out.
