boost multiprecision cpp_int output/to string very slow
Playing around with the Boost.Multiprecision library, calculating some big factorials and such.
The problem is that the output takes too long: 100,000! takes 0.5 seconds to calculate and 11 seconds to print, and 1,000,000! takes half an hour (yes, for the output alone).
The output goes through cout, redirected to a file with >: ./prg > file
I tried putting it into a string first, a stringstream, and plain cout. All of them take the same time.
The conversion to a string is simply very slow. Is there any way to speed it up?
Code example:
#include <chrono>
#include <cstdint>
#include <iostream>
#include <string>
#include <boost/multiprecision/cpp_int.hpp>

int main() {
    uint32_t num = 100000;
    boost::multiprecision::cpp_int result = 1;

    // Phase 1: compute num! by repeated multiplication
    std::chrono::steady_clock::time_point begin = std::chrono::steady_clock::now();
    for (uint32_t i = 2; i <= num; i++) {
        result *= i;
    }
    std::chrono::steady_clock::time_point end = std::chrono::steady_clock::now();
    std::cout << "calculation: " << std::chrono::duration_cast<std::chrono::milliseconds>(end - begin).count() / 1000.0 << " sec" << std::endl;

    // Phase 2: convert the result to a decimal string (the slow part)
    std::string s = result.str();
    std::chrono::steady_clock::time_point endOutput = std::chrono::steady_clock::now();
    std::cout << "toString: " << std::chrono::duration_cast<std::chrono::milliseconds>(endOutput - end).count() / 1000.0 << " sec" << std::endl;
    std::cout << "length: " << s.length() << std::endl;
    return 0;
}
Output:
calculation: 1.014 sec
toString: 7.643 sec
length: 456574
Output of the same code in Java using BigInteger:
calculation: 2.646 sec
toString: 0.466 sec
length: 456574
I was going to recommend you post a self-answer. But then I had this crammed into a single comment:
feel free to self-answer @Richard, so the hint will help others. Also, a pet peeve of mine: C++ doesn't need to look horrible: https://godbolt.org/z/53d338843 . In fact I'd go further and make a lap function: https://godbolt.org/z/6nf659Y91 . Also nice: https://wandbox.org/permlink/3KCKBn1tOwwJCkIz
So I figured that I might as well post it here for the added value of the demonstration code.
We can express that program a lot more cleanly:
#include <boost/multiprecision/gmp.hpp>
#include <chrono>
#include <cstdint>
#include <iostream>
#include <string>
using namespace std::chrono_literals;
namespace bmp = boost::multiprecision;
auto now = std::chrono::steady_clock::now;

// Computes num! using the GMP-backed integer type
auto factorial(uint32_t num) {
    bmp::mpz_int result{1};
    while (num)
        result *= num--;
    return result;
}

int main() {
    auto start = now();
    auto result = factorial(100'000);
    auto mid = now();
    // Dividing a duration by 1.s yields the elapsed time in seconds as a double
    std::cout << "calculation: " << (mid - start) / 1.s << "s" << std::endl;
    std::string s = result.str();
    std::cout << "toString: " << (now() - mid) / 1.s << "s" << std::endl;
    std::cout << "length: " << s.length() << "\n";
}
Which may print something like
calculation: 2.17467s
toString: 0.0512504s
length: 456574
To see how much better mpz_int may perform, let's compare them:
#include <boost/multiprecision/cpp_int.hpp>
#include <boost/multiprecision/gmp.hpp>
#include <chrono>
#include <cstdint>
#include <iostream>
#include <string>
using namespace std::chrono_literals;
namespace bmp = boost::multiprecision;

// Computes num! in whichever big-integer type T is requested
template <typename T> T factorial(uint32_t num) {
    T result{1};
    while (num)
        result *= num--;
    return result;
}

// Evaluates expr inside a lambda; the local object's destructor prints the
// elapsed time and the expression text to stderr when the lambda returns.
#define TIMED(expr) \
    [&]() -> decltype(auto) { \
        using C = std::chrono::steady_clock; \
        using namespace std::chrono_literals; \
        struct X { \
            C::time_point s = C::now(); \
            ~X() { \
                std::cerr << std::fixed << (C::now() - s) / 1.s << "s\t" \
                          << #expr << std::endl; \
            } \
        } x; \
        return (expr); \
    }()

template <typename T> void bench() {
    auto r = TIMED(factorial<T>(100'000));
    auto s = TIMED(r.str());
    std::cout << "length: " << s.length() << "\n";
}

int main() {
    TIMED(bench<bmp::mpz_int>());
    std::cout << "-----\n";
    TIMED(bench<bmp::cpp_int>());
}
Which may print something like
0.953427s factorial<T>(100'000)
0.040691s r.str()
length: 456574
0.994284s bench<bmp::mpz_int>()
-----
1.410608s factorial<T>(100'000)
8.014350s r.str()
length: 456574
9.425064s bench<bmp::cpp_int>()
As you can see, GMP is orders of magnitude more optimized: the str() operation was about 200x faster in my test run (0.040691s versus 8.014350s).