简体   繁体   中英

Alignment and performance

Routines strcmp for comparing char * and memcmp for everything else, do they run faster on memory block (on x86_64) which is somehow aligned (how?)? Does libc use SSE for this routines?

It depends, but on architectures where alignment matters or where SIMD instructions are available, typically the routines will operate on leading bytes, then do as many wide aligned operations as the data allows, then operate on trailing bytes.

Whether the leading and trailing bytes are contributing significantly to the processing time for your data can be determined by experiment.

If you worry about performance for comparison, you should take a look at well-known Boyer-Moore alogrithm and this post from GNU Grep author, Mike Haertel.

He explains how one can manage to be really fast about searching something in a data block.

His summary is quite clear about what to do :

  • Use Boyer-Moore (and unroll its inner loop a few times).
  • Roll your own unbuffered input using raw system calls. Avoid copying the input bytes before searching them. (Do, however, use buffered output . The normal grep scenario is that the amount of output is small compared to the amount of input, so the overhead of output buffer copying is small, while savings due to avoiding many small unbuffered writes can be large.)
  • Don't look for newlines in the input until after you've found a match.
  • Try to set things up (page-aligned buffers, page-sized read chunks, optionally use mmap) so the kernel can ALSO avoid copying the bytes.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM