Shouldn't isdigit() be faster in C++?

I am using the isdigit() function in C++, but I found it slow, so I implemented my own is_digit(). See my code below:

#include<iostream>
#include<cctype>
#include<ctime>
using namespace std;
static inline bool is_digit(char c)
{
    return c>='0'&&c<='9';
}
int main()
{
    char c='8';
    time_t t1=clock(),t2,t3;
    for(int i=0;i<1e9;i++)
        is_digit(c);
    t2=clock();
    for(int i=0;i<1e9;i++)
        isdigit(c);
    t3=clock();
    cout<<"is_digit:"<<(t2-t1)/CLOCKS_PER_SEC<<"\nisdigit:"<<(t3-t2)/CLOCKS_PER_SEC<<endl;

    return 0;
}

After running, is_digit() took only about 1 second (1161 ms), but isdigit() took about 4 seconds (3674 ms). I know that isdigit() is implemented with bit operations, so shouldn't isdigit() be faster than is_digit()?


Update 1

I use MS VS2010 with default options, release build. How can I make isdigit() faster than is_digit() in VS?

Update 2

Thanks to all of you. In release mode, VS optimizes the project for speed by default ( -O2 ).

All in release mode.

VS2010: is_digit:1182(ms) isdigit:3724(ms)

VS2013: is_digit:0(ms) isdigit:3806(ms)

Codeblocks with g++(4.7.1) with -O3: is_digit:1275(ms) isdigit:1331(ms)

So here is the conclusion:

is_digit() is faster than isdigit() in VS, and about as fast as isdigit() in g++ (1275 ms vs. 1331 ms).

And isdigit() in g++ is faster than isdigit() in VS.

So "VS sucks" in performance?

In clang/llvm [the compiler of my choice], isdigit and is_digit will turn into exactly the same code, as it has an optimisation for that specific library call that translates it into ((unsigned)(c-48) < 10u) .

The return c>='0' && c<='9'; is also turned into (unsigned)(c-48) < 10 by the optimisation (as a generic if x >= N && x <= M -> (unsigned)(x-N) <= (M-N) conversion that the compiler does).
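To make the range-check rewrite concrete, here is a minimal sketch that compares the two forms over every value a char or unsigned char can hold. The function names are made up for illustration, and the sketch assumes C++14 or later (for the loop inside a constexpr function) and a 32-bit unsigned type.

```cpp
#include <cstdint>

// Two-comparison form, as written in the question.
constexpr bool in_range_two_compares(int c)
{
    return c >= '0' && c <= '9';
}

// Single-comparison form: values below '0' wrap around to huge unsigned
// numbers, so one unsigned "< 10" check rejects both ends of the range.
constexpr bool in_range_one_compare(int c)
{
    return static_cast<std::uint32_t>(c) - static_cast<std::uint32_t>('0') < 10u;
}

// Returns true if both forms give the same answer for every value in
// [-128, 255], which covers signed and unsigned 8-bit chars.
constexpr bool forms_agree()
{
    for (int c = -128; c <= 255; ++c)
        if (in_range_two_compares(c) != in_range_one_compare(c))
            return false;
    return true;
}

// Checked at compile time, so a mismatch would fail the build.
static_assert(forms_agree(), "both range checks must agree");
```

Because the check runs in a static_assert, simply compiling the snippet is enough to confirm that the two forms agree on that range.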

So, in theory, both loops SHOULD turn into the same code (at least with a compiler that has this type of optimisation for isdigit; whether MSVC does or not, I can't say, as the source code is not available to the general public). I know that gcc has similar code to optimise library calls, but I don't have the gcc source on my machine at present, and I can't be bothered to look it up [in my experience, it'll be a bit more difficult to read than the llvm code, anyway].

Code in llvm:

Value *LibCallSimplifier::optimizeIsDigit(CallInst *CI, IRBuilder<> &B) {
  Function *Callee = CI->getCalledFunction();
  FunctionType *FT = Callee->getFunctionType();
  // We require integer(i32)
  if (FT->getNumParams() != 1 || !FT->getReturnType()->isIntegerTy() ||
      !FT->getParamType(0)->isIntegerTy(32))
    return nullptr;

  // isdigit(c) -> (c-'0') <u 10
  Value *Op = CI->getArgOperand(0);
  Op = B.CreateSub(Op, B.getInt32('0'), "isdigittmp");
  Op = B.CreateICmpULT(Op, B.getInt32(10), "isdigit");
  return B.CreateZExt(Op, CI->getType());
}

For those not familiar with LLVM code: It first checks that the function call has the correct number of parameters and parameter types. If that fails, it returns nullptr to indicate "I can't optimise this". Otherwise, it builds the chain of operations to compute (c - '0') < 10, using an unsigned comparison to cope with "negative" values [which in unsigned are huge values].

This optimisation would go wrong if you define your own isdigit like this:

bool isdigit(int x)
{
   return image_contains_finger(imagefiles[x]); 
}

[But then, replacing library functions with your own version that does something else will most likely have interesting effects in general!]

Have a look at this code (works with g++), compiled with -O3:

#include<iostream>
#include<cctype>
#include<ctime>
#include <time.h>
#include <sys/time.h>
using namespace std;
static inline bool is_digit(char c)
{
    return c>='0'&&c<='9';
}
int main()
{
    char c='8';
    struct timeval tvSt, tvEn;
    time_t t1=clock(),t2,t3;
    gettimeofday(&tvSt, 0);
    for(int i=0;i<1e9;i++)
        is_digit(c);
    gettimeofday(&tvEn, 0);
    cout << "is_digit:" << (tvEn.tv_sec - tvSt.tv_sec)*1000000 + (tvEn.tv_usec - tvSt.tv_usec) << " us"<< endl;
    gettimeofday(&tvSt, 0);
    for(int i=0;i<1e9;i++)
        isdigit(c);
    gettimeofday(&tvEn, 0);
    cout << "isdigit:" << (tvEn.tv_sec - tvSt.tv_sec)*1000000 + (tvEn.tv_usec - tvSt.tv_usec) << " us"<< endl;

    return 0;
}

Results:

is_digit:1610771 us
isdigit:1055976 us

So, the C++ implementation beats yours.
Normally, when you measure performance, it's not a good idea to do it in seconds. At least consider microsecond resolution.

I'm not sure about VS. Please find a microsecond-level clock and measure.
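One portable option for a microsecond-level clock, which should work with both g++ and VS, is std::chrono::steady_clock from C++11. The helper below is only a sketch; its name, elapsed_us, is made up for illustration and is not part of any library.

```cpp
#include <chrono>
#include <cstdint>

// Runs `work` once and returns the elapsed time in microseconds.
// steady_clock is monotonic, so the result is never negative.
template <typename Work>
std::int64_t elapsed_us(Work&& work)
{
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

A loop from the benchmark could then be timed with something like elapsed_us([&] { for (int i = 0; i < 1000000000; ++i) sum += isdigit(c); }), where sum is printed afterwards so the loop has a visible effect.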

PS. Please refer to https://msdn.microsoft.com/en-us/library/19z1t1wy.aspx for VS optimizations.

Your function is_digit can be implemented faster by:

#define ISDIGIT(X) (((uint32_t)(X) - '0') < 10u)

where you save one comparison. I think this is the normal approach in gcc, but in Microsoft Visual Studio I guess you get a localized version of isdigit() (which therefore takes a long time checking locales).
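Because a macro pastes its argument in as text, an argument such as i & 255 depends on how carefully the macro is parenthesised. One way to keep the same single-comparison trick without that concern is a constexpr function. The sketch below uses a hypothetical name, is_ascii_digit, and assumes plain ASCII digits with no locale involved.

```cpp
#include <cstdint>

// Hypothetical helper (not from the original answer): the same single
// unsigned comparison as the macro, but the argument is evaluated once,
// as an ordinary value, so call-site operator precedence cannot interfere.
constexpr bool is_ascii_digit(int c)
{
    return static_cast<std::uint32_t>(c) - static_cast<std::uint32_t>('0') < 10u;
}

// Compile-time sanity checks on the boundaries of the digit range.
static_assert(is_ascii_digit('0') && is_ascii_digit('9'), "digits are accepted");
static_assert(!is_ascii_digit('/') && !is_ascii_digit(':'), "neighbours are rejected");
// An expression argument is evaluated before the call, as expected.
static_assert(is_ascii_digit(('7' + 256) & 255), "expression arguments work");
```

With optimisation enabled, a compiler would normally inline such a function, so it should not cost more than the macro, though that is worth confirming by measurement.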

The benchmark proposed by @Doonyx falls into several pitfalls:

  1. using a constant char c = '8'; ... any compiler would see that it doesn't change and could cache or skip the result.
  2. the loop calls a function but the result is not used anywhere => again, compilers can just skip the loop altogether.
  3. it doesn't take into account variations in CPU performance: the CPU could take some time to "wake up", and generally its performance can vary over time.

=> I modified that benchmark to address all 3 points.
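As a rough illustration of the first two points in isolation, the sketch below derives the input from the loop counter and returns an accumulated count that the caller can print. The function name count_digit_hits is made up for this example, and an aggressive optimizer could in principle still simplify the loop, so treat it as a sketch rather than a guarantee.

```cpp
#include <cstdint>

static inline bool is_digit(char c)
{
    return c >= '0' && c <= '9';
}

// Counts how many of the generated inputs are digits.
// - The input changes on every iteration (it is derived from the loop
//   counter), so the result cannot be computed once and reused.
// - The count is returned so the caller can print it, which gives the
//   loop an observable effect that the compiler has to preserve.
std::int64_t count_digit_hits(std::int64_t iterations)
{
    std::int64_t hits = 0;
    for (std::int64_t i = 0; i < iterations; ++i)
        hits += is_digit(static_cast<char>(i & 127));
    return hits;
}
```

Printing the returned count after timing the call also gives a quick correctness check: every block of 128 consecutive inputs contains exactly 10 digits.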

// gcc main.cpp -O3 -std=c++20 -lstdc++ && ./a.out

#include <algorithm> // std::sort
#include <chrono>
#include <cstdint>   // int64_t, uint32_t
#include <iomanip>
#include <iostream>
#include <map>
#include <string>
#include <vector>

// basic function
static inline bool is_digit(char c)
{
    return c >= '0' && c <= '9';
}

// optimized function
constexpr bool is_digit2(int c)
{
    return (uint32_t)(c - '0') < 10u;
}

constexpr int NUM_STEP = 8;
constexpr int TRIM     = 2;

#define NOW_NS() std::chrono::high_resolution_clock::now().time_since_epoch().count()

int main()
{
    int64_t                                     sum;
    std::map<std::string, std::vector<int64_t>> nameTimes;
    std::map<std::string, int64_t>              nameAvgs;

// convenience define to run the benchmark
#define RUN_BENCH(name, code)                                                        \
    do                                                                               \
    {                                                                                \
        const auto start = NOW_NS();                                                 \
        sum              = 0;                                                        \
        for (int i = 0; i < 1000000000; ++i)                                         \
            sum += code;                                                             \
        const auto name##Time = NOW_NS() - start;                                    \
        nameTimes[#name].push_back(name##Time);                                      \
        std::cout << step << " " << std::setw(11) << #name << ": "                   \
                << std::setw(10) << name##Time << " ns  sum=" << sum << std::endl; \
    }                                                                                \
    while (0)

    // 1) run the benchmark NUM_STEP times
    // note that a null test is added to compute the overhead
    for (int step = 0; step < NUM_STEP; ++step)
    {
        RUN_BENCH(_null, i & 15);
        RUN_BENCH(is_digit, is_digit(i & 255));
        RUN_BENCH(is_digit2, is_digit2(i & 255));
        RUN_BENCH(std_isdigit, std::isdigit(i & 255));
    }

    // 2) remove the 25% slowest and 25% fastest runs for each benchmark (Interquartile range)
    std::cout << "\ncombining:\n";
    for (auto& [name, times] : nameTimes)
    {
        int64_t total = 0;
        std::sort(times.begin(), times.end());
        std::cout << std::setw(11) << name;
        for (int i = 0; i < NUM_STEP; ++i)
        {
            std::cout << " " << i << ":" << times[i];
            if (i >= TRIM && i < NUM_STEP - TRIM)
            {
                std::cout << "*";
                total += times[i];
            }
        }
        total /= (NUM_STEP - TRIM * 2);
        std::cout << " => " << total << " ns\n";
        nameAvgs[name] = total;
    }

    // 3) show the results + results MINUS the overhead (null time)
    std::cout << "\nsummary:\n";
    for (auto& [name, time] : nameAvgs)
    {
        std::cout << std::setw(11) << name << ": " << std::setw(10) << time << " ns "
                << " time-null: " << std::setw(10) << time - nameAvgs["_null"] << " ns\n";
    }

    return 0;
}

Each benchmark is now a bit more complex, which forces the compiler to actually execute the code. The benchmarks run sequentially, and the whole sequence is repeated 8 times to take CPU performance variation into account. The slowest and fastest runs are then discarded, and in the final summary the overhead time is subtracted to give an idea of the true speed of the functions.
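The step that discards the slowest and fastest runs can be sketched on its own as a small trimmed-mean helper. The name trimmed_mean and the choice to return 0 when nothing is left are assumptions made for this example; the benchmark above does the same work inline with NUM_STEP = 8 and TRIM = 2.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Sorts the timings, drops `trim` values from each end, and returns the
// integer average of what remains. Returns 0 if trimming would leave nothing.
std::int64_t trimmed_mean(std::vector<std::int64_t> times, std::size_t trim)
{
    if (times.size() <= 2 * trim)
        return 0;
    std::sort(times.begin(), times.end());
    std::int64_t total = 0;
    for (std::size_t i = trim; i < times.size() - trim; ++i)
        total += times[i];
    return total / static_cast<std::int64_t>(times.size() - 2 * trim);
}
```

With 8 timings and trim set to 2, only the middle 4 values contribute, so a single unusually slow or fast run does not skew the average.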

gcc 11.2.0 with -O0:
      _null:  680327226 ns  time-null:          0 ns
   is_digit: 1368190759 ns  time-null:  687863533 ns
  is_digit2: 1223091465 ns  time-null:  542764239 ns
std_isdigit:  733283544 ns  time-null:   52956318 ns *

msvc 17.3.4 with -O0:
      _null:  576647075 ns  time-null:          0 ns
   is_digit: 1348345625 ns  time-null:  771698550 ns
  is_digit2:  754253650 ns  time-null:  177606575 ns *
std_isdigit: 1619403975 ns  time-null: 1042756900 ns

gcc 11.2.0 with -O1:
      _null:  217714988 ns  time-null:          0 ns
   is_digit:  459088203 ns  time-null:  241373215 ns
  is_digit2:  434988334 ns  time-null:  217273346 ns *
std_isdigit:  435391905 ns  time-null:  217676917 ns *

msvc 17.3.4 with -O1:
      _null:  217425875 ns  time-null:          0 ns
   is_digit:  442688400 ns  time-null:  225262525 ns *
  is_digit2:  440954975 ns  time-null:  223529100 ns *
std_isdigit: 1187352900 ns  time-null:  969927025 ns

gcc 11.2.0 with -O2:
      _null:  217411308 ns  time-null:          0 ns
   is_digit:  542259068 ns  time-null:  324847760 ns
  is_digit2:  434180245 ns  time-null:  216768937 ns *
std_isdigit:  435705056 ns  time-null:  218293748 ns *

msvc 17.3.4 with -O2:
      _null:  209602025 ns  time-null:          0 ns
   is_digit:  441704325 ns  time-null:  232102300 ns
  is_digit2:  298747075 ns  time-null:   89145050 ns *
std_isdigit: 1198361400 ns  time-null:  988759375 ns

gcc 11.2.0 with -O3:
      _null:  126789606 ns  time-null:          0 ns
   is_digit:  206127551 ns  time-null:   79337945 ns
  is_digit2:  175606336 ns  time-null:   48816730 ns *
std_isdigit:  174991923 ns  time-null:   48202317 ns *

msvc 17.3.4 with -Ox:
      _null:  206283850 ns  time-null:          0 ns
   is_digit:  434584200 ns  time-null:  228300350 ns
  is_digit2:  312153225 ns  time-null:  105869375 ns *
std_isdigit: 1176565150 ns  time-null:  970281300 ns

Conclusion:

  • on gcc, std::isdigit is as fast as the is_digit2 function
  • on msvc, std::isdigit is about 9x slower than is_digit2 (but this might be due to a locale setting)
