Unexpected performance from memcpy and memmove

Question

Why does memcpy perform slower than memmove on my system?

Reading other SO questions such as this or this gives the impression that memcpy should be faster than memmove, and intuitively this should be so. After all, memcpy has fewer checks to perform, and the man pages agree with what those answers say.

However, when measuring the time spent inside each function, memmove beats memcpy! What's more, it seems to beat memset too, even though memset looks like it could benefit from optimizations that memcpy and memmove can't use. Why would this be so?

Results (one of many runs) on my computer, as nanosecond differences:

[INFO] (ex23.c:151 func: main) Normal copy: 109092
[INFO] (ex23.c:198 func: main) memcpy: 66070
[INFO] (ex23.c:209 func: main) memmove: 53149
[INFO] (ex23.c:219 func: main) memset: 52451

Code used to produce this result:

#include <stdio.h>
#include <string.h>
#include "dbg.h" // debugging macros
#include <time.h>

int main(int argc, char *argv[])
{
    char from[10000] = {'a'};
    char to[10000] = {'c'};
    int rc = 0;
    struct timespec before; 
    memset(from, 'x', 10000);
    memset(to, 'y', 10000);

    clock_gettime(CLOCK_REALTIME, &before);

    // naive assignment using a for loop
    normal_copy(from, to, 10000);
    struct timespec after;
    clock_gettime(CLOCK_REALTIME, &after);
    log_info("Normal copy: %ld", (after.tv_nsec - before.tv_nsec));


    memset(to, 'y', 10000);
    clock_gettime(CLOCK_REALTIME, &before); 
    memcpy(to, from, 10000);
    clock_gettime(CLOCK_REALTIME, &after);
    log_info("memcpy: %ld", (after.tv_nsec - before.tv_nsec));

    memset(to, 'y', 10000);
    clock_gettime(CLOCK_REALTIME, &before);
    memmove(to, from, 10000);
    clock_gettime(CLOCK_REALTIME, &after);
    log_info("memmove: %ld", (after.tv_nsec - before.tv_nsec));

    memset(to, 'y', 10000);
    clock_gettime(CLOCK_REALTIME, &before);
    memset(to, 'x', 10000);
    clock_gettime(CLOCK_REALTIME, &after);
    log_info("memset: %ld", (after.tv_nsec - before.tv_nsec));

    return 0;
}

Answer 1

As @Carl Norum and @Greg Hewgill say: cache effects.

You're certainly experiencing the effects of cached memory. Re-order your tests and compare results. When I tested memcpy() both before and after memmove(), the 2nd memcpy() performed like memmove() and was also faster than the first memcpy().
