
Just out of curiosity: why is the Linux kernel's "optimized" strcpy much slower than the libc implementation?

I tried to benchmark the optimized string operations in http://lxr.linux.no/#linux+v2.6.38/arch/x86/lib/string_32.c and compare them with the regular strcpy:

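#include <stdio.h>
#include <stdlib.h>

/* Byte-at-a-time copy taken from arch/x86/lib/string_32.c. */
char *_strcpy(char *dest, const char *src)
{
        /* Dummy outputs: they tell the compiler that the asm modifies
           SI, DI and AX. The two pointer-holding dummies are long so
           that they are pointer-sized on x86-64 as well (the kernel's
           32-bit source uses int). */
        long d0, d1;
        int d2;
        asm volatile("1:\tlodsb\n\t"            /* AL = *src++ */
                "stosb\n\t"                     /* *dest++ = AL */
                "testb %%al,%%al\n\t"           /* was it the terminator? */
                "jne 1b"                        /* no: copy the next byte */
                : "=&S" (d0), "=&D" (d1), "=&a" (d2)
                : "0" (src), "1" (dest) : "memory");
        return dest;
}

int main(int argc, char **argv)
{
        int times = 1;          /* number of copies, taken from argv[1] */
        char a[100];

        if (argc > 1)
                times = atoi(argv[1]);

        for (; times; times--)
                _strcpy(a, "Hello _strcpy!");

        return 0;
}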

Timing it (using time ..) showed that it is about 10 times slower than the regular strcpy (on x64 Linux).

Why?

If your string is a constant, it's possible that the compiler is inlining the copy (for the plain strcpy call), turning it into a series of unconditional MOV instructions. Since that is straight-line code with no conditional branches, it would be faster than the Linux variant, which loads, stores, tests and branches once for every byte.
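
To make the difference concrete, here is a minimal sketch of the two cases in plain C. The helper names copy_constant and copy_bytewise are made up for illustration and are not part of the original benchmark. Whether the first one really gets expanded into fixed-size stores depends on the compiler and the optimization level, so treat it as something to check rather than a guarantee.

```c
#include <string.h>

/* Hypothetical helper: the source is a string literal, so the compiler
   knows its size (15 bytes including the terminating NUL) at compile
   time and is allowed to replace the strcpy call with a few plain
   stores, with no loop at all. */
char *copy_constant(char *dest)
{
        return strcpy(dest, "Hello _strcpy!");
}

/* Hypothetical helper: plain C counterpart of the lodsb/stosb loop.
   Each byte costs a load, a store, a test and a conditional branch,
   and the length is only discovered while copying. */
char *copy_bytewise(char *dest, const char *src)
{
        char *d = dest;

        while ((*d++ = *src++) != '\0')
                ;       /* all the work happens in the loop condition */
        return dest;
}
```

One way to check what actually happens is to compile with gcc -O2 -S and read the assembly generated for copy_constant. Building with -fno-builtin-strcpy should force a real call to the libc strcpy, which gives a fairer comparison against the hand-written loop.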
