boost multiprecision cpp_int output/to string very slow

Playing around with the boost multiprecision library. Calculating some big factorial numbers and such.

Problem is the output takes too long. 100,000! takes 0.5 seconds to calculate and 11 seconds to print; 1,000,000! takes half an hour (yes, the output only).

Using cout for the output, with > to put it into a file: ./prg > file

Tried putting it into a string first, a stringstream, normal cout. Everything is the same.

The conversion to a string just takes very long. Is there any way to speed it up?

Code example:

#include <iostream>
#include <chrono>
#include <cstdint>
#include <string>
#include <boost/multiprecision/cpp_int.hpp>

int main() {
    uint32_t num = 100000;
    boost::multiprecision::cpp_int result = 1;

    std::chrono::steady_clock::time_point begin = std::chrono::steady_clock::now();

    for (uint32_t i = 2; i <= num; i++) {
        result *= i;
    }

    std::chrono::steady_clock::time_point end = std::chrono::steady_clock::now();
    std::cout << "calculation: " << std::chrono::duration_cast<std::chrono::milliseconds> (end - begin).count() / 1000.0 << " sec" << std::endl;

    std::string s = result.str();

    std::chrono::steady_clock::time_point endOutput = std::chrono::steady_clock::now();
    std::cout << "toString: " << std::chrono::duration_cast<std::chrono::milliseconds> (endOutput - end).count() / 1000.0 << " sec" << std::endl;

    std::cout << "length: " << s.length() << std::endl;

    return 0;
}

Output:

calculation: 1.014 sec
toString: 7.643 sec
length: 456574

Output of the same code in Java using BigInteger:

calculation: 2.646 sec
toString: 0.466 sec
length: 456574

I was going to recommend you post a self-answer. But then I had this crammed into a single comment:

Feel free to self-answer, @Richard, so the hint will help others. Also, a pet peeve of mine: C++ doesn't need to look horrible (https://godbolt.org/z/53d338843). In fact I'd go further and make a lap function (https://godbolt.org/z/6nf659Y91). Also nice: https://wandbox.org/permlink/3KCKBn1tOwwJCkIz

So I figured that I might as well post it here for the added value of demonstration code.
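For readers who cannot open the links, a lap helper along those lines might look roughly like this. It is a sketch rather than the linked code: LapTimer and lap() are names invented here, and it only assumes std::chrono::steady_clock.

```cpp
#include <chrono>

using namespace std::chrono_literals;

// Hypothetical helper (not taken from the linked snippets): each call to
// lap() returns the seconds elapsed since the previous call, or since
// construction for the first call, and restarts the measurement.
class LapTimer {
    using Clock = std::chrono::steady_clock;
    Clock::time_point last_ = Clock::now();

  public:
    double lap() {
        auto const now = Clock::now();
        // Dividing one duration by another yields a plain number,
        // here the elapsed time expressed in seconds.
        double const seconds = (now - last_) / 1.0s;
        last_ = now;
        return seconds;
    }
};
```

With something like this, the timing code in main shrinks to one LapTimer object and a call to lap() after each phase (calculation, then string conversion).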

Modernizing The Repro

We can express that program a lot more cleanly. Note that this version also switches from cpp_int to the GMP-backed mpz_int:

#include <boost/multiprecision/gmp.hpp>
#include <chrono>
#include <iostream>

using namespace std::chrono_literals;
namespace bmp = boost::multiprecision;
auto now      = std::chrono::steady_clock::now;

auto factorial(uint32_t num) {
    bmp::mpz_int result{1};
    while (num)
        result *= num--;
    return result;
}

int main() {
    auto start = now();
    auto result = factorial(100'000);

    auto mid = now();
    std::cout << "calculation: " << (mid - start) / 1.s << "s" << std::endl;

    std::string s = result.str();

    std::cout << "toString: "    << (now() - mid) / 1.s << "s" << std::endl;
    std::cout << "length: "      << s.length()          << "\n";
}

Which may print something like

calculation: 2.17467s
toString: 0.0512504s
length: 456574
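To get a feel for why decimal conversion can dominate the run time, here is a toy model using only the standard library. It converts a big number to decimal by repeatedly dividing by 10^9; every division walks over all limbs and the number of divisions grows with the length of the number, so the total cost is quadratic. Whether cpp_int's str() works exactly this way is an assumption on my part, not something measured in this post. GMP, by contrast, uses subquadratic radix conversion for large operands. All names below (Limbs, mul_small, divmod_small, to_decimal) are made up for the illustration.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Toy unsigned big integer: little-endian limbs in base 2^32.
// Illustration only; this is not how cpp_int or GMP are implemented.
using Limbs = std::vector<std::uint32_t>;

// Multiply in place by a small factor (linear in the limb count).
void mul_small(Limbs& n, std::uint32_t factor) {
    std::uint64_t carry = 0;
    for (auto& limb : n) {
        std::uint64_t const cur = std::uint64_t{limb} * factor + carry;
        limb  = static_cast<std::uint32_t>(cur);
        carry = cur >> 32;
    }
    if (carry)
        n.push_back(static_cast<std::uint32_t>(carry));
}

// Divide in place by a small divisor and return the remainder.
// A single call touches every limb, so it is linear in the limb count.
std::uint32_t divmod_small(Limbs& n, std::uint32_t divisor) {
    std::uint64_t rem = 0;
    for (auto it = n.rbegin(); it != n.rend(); ++it) {
        std::uint64_t const cur = (rem << 32) | *it;
        *it = static_cast<std::uint32_t>(cur / divisor);
        rem = cur % divisor;
    }
    while (!n.empty() && n.back() == 0)
        n.pop_back(); // drop limbs that became zero at the top
    return static_cast<std::uint32_t>(rem);
}

// Peel off 9 decimal digits per division: many linear passes, quadratic overall.
std::string to_decimal(Limbs n) {
    if (n.empty())
        return "0";
    std::string out; // least significant digit first, reversed at the end
    while (!n.empty()) {
        std::uint32_t chunk = divmod_small(n, 1'000'000'000u);
        for (int i = 0; i < 9; ++i) {
            out.push_back(static_cast<char>('0' + chunk % 10));
            chunk /= 10;
        }
    }
    while (out.size() > 1 && out.back() == '0')
        out.pop_back(); // strip the zero padding of the most significant chunk
    std::reverse(out.begin(), out.end());
    return out;
}

// Factorial on the toy representation, mirroring the loop in the question.
Limbs factorial(std::uint32_t num) {
    Limbs result{1};
    for (std::uint32_t i = 2; i <= num; ++i)
        mul_small(result, i);
    return result;
}
```

Calling to_decimal(factorial(n)) for growing n and timing only the to_decimal part should show the conversion time rising much faster than the digit count, which is the kind of behaviour the question describes.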

More Comparative Benchmarks

To see how much better mpz_int may perform, let's compare them:

Live On Wandbox

#include <boost/multiprecision/cpp_int.hpp>
#include <boost/multiprecision/gmp.hpp>
#include <chrono>
#include <iostream>

using namespace std::chrono_literals;
namespace bmp = boost::multiprecision;

template <typename T> T factorial(uint32_t num) {
    T result{1};
    while (num)
        result *= num--;
    return result;
}

#define TIMED(expr)                                                            \
    [&]() -> decltype(auto) {                                                  \
        using C = std::chrono::steady_clock;                                   \
        using namespace std::chrono_literals;                                  \
        struct X {                                                             \
            C::time_point s = C::now();                                        \
            ~X() {                                                             \
                std::cerr << std::fixed << (C::now() - s) / 1.s << "s\t"       \
                          << #expr << std::endl;                               \
            }                                                                  \
        } x;                                                                   \
        return (expr);                                                         \
    }()

template <typename T> void bench() {
    auto r = TIMED(factorial<T>(100'000));
    auto s = TIMED(r.str());
    std::cout << "length: " << s.length() << "\n";
}

int main() {
    TIMED(bench<bmp::mpz_int>());
    std::cout << "-----\n";
    TIMED(bench<bmp::cpp_int>());
}

Which may print something like

0.953427s       factorial<T>(100'000)
0.040691s       r.str()
length: 456574
0.994284s       bench<bmp::mpz_int>()
-----
1.410608s       factorial<T>(100'000)
8.014350s       r.str()
length: 456574
9.425064s       bench<bmp::cpp_int>()

As you can see, GMP is orders of magnitude more optimized here (~200x faster for the str() operation in my test run: 0.04s versus 8.0s).
