
Would std::count_if be faster without an if?

Here's the gcc std::count_if code:

template<typename _InputIterator, typename _Predicate>
typename iterator_traits<_InputIterator>::difference_type
count_if(_InputIterator __first, _InputIterator __last, _Predicate __pred)
{
    [snip]
    typename iterator_traits<_InputIterator>::difference_type __n = 0;
    for (; __first != __last; ++__first)
        if (__pred(*__first))
            ++__n;
    return __n;
}

My question: would it work better (i.e., faster) to use

__n += __pred(*__first); // instead of the if statement

This version always does an add, but doesn't do a branch.

The replacement you gave is not equivalent, because there are far fewer restrictions on a predicate than you think:

  • Anything that can be used in a conditional context (that is, can be contextually converted to bool) is a valid return type for the predicate. An explicit conversion to bool is enough.
  • That return type can react oddly to being added to the iterator's difference type.
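To make the first bullet concrete, here is a minimal sketch of a predicate whose result type only converts to bool explicitly. The names Truthy, IsEven and count_even are made up for illustration and do not come from the standard or from gcc. The predicate is acceptable to std::count_if, while the proposed `__n += __pred(*__first)` would not compile for it.

```cpp
#include <algorithm>
#include <cassert>
#include <type_traits>
#include <vector>

// Hypothetical result type: usable in `if (...)`, but with no implicit
// conversion to bool or to any integer type.
struct Truthy {
    bool value;
    explicit operator bool() const { return value; }
};

// Hypothetical predicate that returns Truthy instead of bool.
struct IsEven {
    Truthy operator()(int x) const { return Truthy{x % 2 == 0}; }
};

static_assert(!std::is_convertible<Truthy, bool>::value,
              "Truthy has no implicit conversion to bool");
static_assert(std::is_constructible<bool, Truthy>::value,
              "Truthy can still be converted to bool explicitly");

// Counts the even values in v.
inline long count_even(const std::vector<int>& v) {
    // std::count_if only needs the contextual conversion, so this compiles.
    // By contrast, `n += IsEven{}(x);` would be rejected: Truthy cannot be
    // added to an integer.
    return static_cast<long>(std::count_if(v.begin(), v.end(), IsEven{}));
}
```

Calling count_even on a vector holding 1, 2, 3, 4, 6 counts the three even values.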

25 Algorithms library [algorithms]

25.1 General [algorithms.general]

8 The Predicate parameter is used whenever an algorithm expects a function object (20.9) that, when applied to the result of dereferencing the corresponding iterator, returns a value testable as true. In other words, if an algorithm takes Predicate pred as its argument and first as its iterator argument, it should work correctly in the construct pred(*first) contextually converted to bool (Clause 4). The function object pred shall not apply any non-constant function through the dereferenced iterator.

The most likely return value to give your replacement indigestion would be of a standard integer type, with a value that is neither 0 nor 1.
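As a small illustration of that case, the sketch below uses a made-up predicate, has_bit_1_or_2, that returns masked bits rather than 0 or 1. The three counting helpers are simplified stand-ins written for this example, not the gcc implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical predicate: returns the masked bits, so the result can be
// 0, 2, 4 or 6 rather than just 0 or 1.
inline int has_bit_1_or_2(int x) { return x & 6; }

// Counting with a branch, as in the original count_if loop.
template <typename Pred>
std::ptrdiff_t count_with_if(const std::vector<int>& v, Pred pred) {
    std::ptrdiff_t n = 0;
    for (int x : v)
        if (pred(x)) ++n;
    return n;
}

// The proposed replacement: adds the raw return value of the predicate.
template <typename Pred>
std::ptrdiff_t count_with_add(const std::vector<int>& v, Pred pred) {
    std::ptrdiff_t n = 0;
    for (int x : v)
        n += pred(x);
    return n;
}

// Converting to bool first restores the original meaning.
template <typename Pred>
std::ptrdiff_t count_with_bool_add(const std::vector<int>& v, Pred pred) {
    std::ptrdiff_t n = 0;
    for (int x : v)
        n += static_cast<bool>(pred(x));
    return n;
}
```

For the input 1, 2, 4, 6, 8 the branch version and the bool-converting version both count 3 matches, while the raw addition sums the return values 2 + 4 + 6 and yields 12.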

Also, keep in mind that compilers optimize really well nowadays (C++ compilers especially need to, with all that template stuff layered deep).

So, first, your suggested code is different. So let's look at two equivalent versions:

template<typename _InputIterator, typename _Predicate>
typename iterator_traits<_InputIterator>::difference_type
count_if(_InputIterator __first, _InputIterator __last, _Predicate __pred) {
    typename iterator_traits<_InputIterator>::difference_type __n = 0;
    for (; __first != __last; ++__first)
        if (__pred(*__first))
            ++__n;
    return __n;
}

And:

template<typename _InputIterator, typename _Predicate>
typename iterator_traits<_InputIterator>::difference_type
count_if(_InputIterator __first, _InputIterator __last, _Predicate __pred) {
    typename iterator_traits<_InputIterator>::difference_type __n = 0;
    for (; __first != __last; ++__first)
        __n += (bool) __pred(*__first);
    return __n;
}

Then, we can compile this with our compiler and look at the assembly. Under one compiler that I tried (clang on OS X), these produced identical code.

Perhaps your compiler will also produce identical code, or perhaps it will produce different code.

Technically it would, but keep in mind that all nonzero values evaluate to true. So the called function might return a value other than 1, which would skew the result. Also, the compiler has the means to optimize the branch away into a conditional move.

To expand, there are certainly ways to optimize the branch away in code, but this reduces readability and maintainability, as well as the ability to debug the code (e.g. by placing breakpoints), and gains very little, since compilers are pretty damn good at optimizing these things on their own.

The code generated by the compiler does not necessarily reproduce C++ language constructs literally in machine code. Just because your C++ code has an if statement in it does not mean that the machine code will be based on a branching instruction. Modern compilers are not required to, and do not, literally implement the behavior of the abstract C++ machine in the generated machine code.

For this reason it is impossible to say whether it will be faster or not. C++ code does not have any inherent "speed" associated with it. C++ code is never executed directly. It can't be "faster" or "slower" from the abstract point of view. If you want to analyze the performance of the code by looking at it, you have to look at the machine code generated by your compiler, not at the C++ code. But an even better method would be to try both variants and profile them by actually running them on various kinds of typical input data.
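One rough way to try both variants is a small timing harness like the sketch below. Everything in it is an assumption made for illustration: the helper names (time_once, make_input, count_branch, count_add), the fixed predicate x < 50, and the uniformly random input. A single timed run is noisy, so treat the numbers as a starting point and repeat the measurement on your own data with optimizations enabled.

```cpp
#include <chrono>
#include <cstddef>
#include <random>
#include <vector>

// Result of one timed call: the count it produced and the elapsed time.
struct Timed {
    std::ptrdiff_t count;
    double milliseconds;
};

// Runs f once and measures it with a monotonic clock.
template <typename F>
Timed time_once(F f) {
    auto start = std::chrono::steady_clock::now();
    std::ptrdiff_t c = f();
    auto stop = std::chrono::steady_clock::now();
    return {c, std::chrono::duration<double, std::milli>(stop - start).count()};
}

// Builds n pseudo-random values in [0, 99] from a fixed seed.
inline std::vector<int> make_input(std::size_t n, unsigned seed) {
    std::mt19937 gen(seed);
    std::uniform_int_distribution<int> dist(0, 99);
    std::vector<int> v(n);
    for (int& x : v) x = dist(gen);
    return v;
}

// Variant with an if statement.
inline std::ptrdiff_t count_branch(const std::vector<int>& v) {
    std::ptrdiff_t n = 0;
    for (int x : v)
        if (x < 50) ++n;
    return n;
}

// Variant that adds the bool-converted result instead of branching in source.
inline std::ptrdiff_t count_add(const std::vector<int>& v) {
    std::ptrdiff_t n = 0;
    for (int x : v)
        n += static_cast<bool>(x < 50);
    return n;
}
```

A typical use would be `auto v = make_input(1000000, 42);` followed by `time_once([&] { return count_branch(v); })` and the same call with count_add, then comparing the milliseconds fields. Both calls should report the same count.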

It is quite possible that a smart compiler will generate identical code for both of your variants.
