
No small string optimization with gcc?

Most std::string implementations (GCC included) use small string optimization. For example, there's an answer discussing this.

Today, I decided to check at what point a string in code I compile gets moved to the heap. To my surprise, my test code seems to show that no small string optimization occurs at all!

Code:

#include <iostream>
#include <string>

using std::cout;
using std::endl;

int main(int argc, char* argv[]) {
  std::string s;

  cout << "capacity: " << s.capacity() << endl;

  cout << (void*)s.c_str() << " | " << s << endl;
  for (int i=0; i<33; ++i) {
    s += 'a';
    cout << (void*)s.c_str() << " | " << s << endl;
  }

}

The output of g++ test.cc && ./a.out is

capacity: 0
0x7fe405f6afb8 | 
0x7b0c38 | a
0x7b0c68 | aa
0x7b0c38 | aaa
0x7b0c38 | aaaa
0x7b0c68 | aaaaa
0x7b0c68 | aaaaaa
0x7b0c68 | aaaaaaa
0x7b0c68 | aaaaaaaa
0x7b0c98 | aaaaaaaaa
0x7b0c98 | aaaaaaaaaa
0x7b0c98 | aaaaaaaaaaa
0x7b0c98 | aaaaaaaaaaaa
0x7b0c98 | aaaaaaaaaaaaa
0x7b0c98 | aaaaaaaaaaaaaa
0x7b0c98 | aaaaaaaaaaaaaaa
0x7b0c98 | aaaaaaaaaaaaaaaa
0x7b0cd8 | aaaaaaaaaaaaaaaaa
0x7b0cd8 | aaaaaaaaaaaaaaaaaa
0x7b0cd8 | aaaaaaaaaaaaaaaaaaa
0x7b0cd8 | aaaaaaaaaaaaaaaaaaaa
0x7b0cd8 | aaaaaaaaaaaaaaaaaaaaa
0x7b0cd8 | aaaaaaaaaaaaaaaaaaaaaa
0x7b0cd8 | aaaaaaaaaaaaaaaaaaaaaaa
0x7b0cd8 | aaaaaaaaaaaaaaaaaaaaaaaa
0x7b0cd8 | aaaaaaaaaaaaaaaaaaaaaaaaa
0x7b0cd8 | aaaaaaaaaaaaaaaaaaaaaaaaaa
0x7b0cd8 | aaaaaaaaaaaaaaaaaaaaaaaaaaa
0x7b0cd8 | aaaaaaaaaaaaaaaaaaaaaaaaaaaa
0x7b0cd8 | aaaaaaaaaaaaaaaaaaaaaaaaaaaaa
0x7b0cd8 | aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
0x7b0cd8 | aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
0x7b0cd8 | aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
0x7b0d28 | aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

I'm guessing that the larger first pointer, i.e. 0x7fe405f6afb8, is a stack pointer, and the other ones point to the heap. Running this many times produces identical results, in the sense that the first address is always large and the other ones are smaller; the exact values usually differ. The smaller addresses always follow the standard power-of-2 allocation scheme, e.g. 0x7b0c38 is listed once, then 0x7b0c68 once, then 0x7b0c38 twice, then 0x7b0c68 4 times, then 0x7b0c98 8 times, etc.

After reading Howard's answer, using a 64-bit machine, I was expecting to see the same address printed for the first 22 characters, and only then to see it change.

Am I missing something?

Also, interestingly, if I compile with -O (at any level), I get a constant small pointer value 0x6021f8 in the first case, instead of the large value, and this 0x6021f8 doesn't change regardless of how many times I run the program.

Output of g++ -v:

Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/foo/bar/gcc-6.2.0/gcc/libexec/gcc/x86_64-redhat-linux/6.2.0/lto-wrapper
Target: x86_64-redhat-linux
Configured with: ../gcc-6.2.0/configure --prefix=/foo/bar/gcc-6.2.0/gcc --build=x86_64-redhat-linux --disable-multilib --enable-languages=c,c++,fortran --with-default-libstdcxx-abi=gcc4-compatible --enable-bootstrap --enable-threads=posix --with-long-double-128 --enable-long-long --enable-lto --enable-__cxa_atexit --enable-gnu-unique-object --with-system-zlib --enable-gold
Thread model: posix
gcc version 6.2.0 (GCC)

One of your flags is:

--with-default-libstdcxx-abi=gcc4-compatible

and GCC4 does not support small string optimization.


GCC5 started supporting it. isocpp states:

A new implementation of std::string is enabled by default, using the small string optimization instead of copy-on-write reference counting.

which supports my claim.

Moreover, Exploring std::string mentions:

As we see, older libstdc++ implements copy-on-write, and so it makes sense for them to not utilize small objects optimization.

and then the author changes context when GCC5 comes into play.
