Memory copy speed comparison CPU<->GPU

I am learning the boost::compute OpenCL wrapper library, and copying data is very slow for me.

If CPU-to-CPU copy speed is normalized to 1, how fast are GPU-to-CPU, GPU-to-GPU, and CPU-to-GPU copies?

I don't need precise numbers; a general idea would be a great help. For example: "CPU-to-CPU is at least 10 times faster than GPU-to-GPU."

No one has answered my question, so I wrote a program to check the copy speed.

#include <vector>
#include <chrono>
#include <algorithm>
#include <iostream>
#include <boost/compute.hpp>
namespace compute = boost::compute;
using namespace std::chrono;
using namespace std;

// Times four copies of 10,000,000 floats:
// host->host, host->device, device->device, device->host.
// Each printed figure is a raw system_clock tick count for a single run;
// the tick period is implementation-defined.
int main()
{
    int sz = 10000000;
    std::vector<float> v1(sz, 2.3f), v2(sz);    // host buffers
    compute::vector<float> v3(sz), v4(sz);      // device buffers (default device)

    // Host to host.
    auto s = system_clock::now();
    std::copy(v1.begin(), v1.end(), v2.begin());
    auto e = system_clock::now();
    cout << "cpu2cpu cp " << (e - s).count() << endl;

    // Host to device, using the default command queue.
    s = system_clock::now();
    compute::copy(v1.begin(), v1.end(), v3.begin());
    e = system_clock::now();
    cout << "cpu2gpu cp " << (e - s).count() << endl;

    // Device to device.
    s = system_clock::now();
    compute::copy(v3.begin(), v3.end(), v4.begin());
    e = system_clock::now();
    cout << "gpu2gpu cp " << (e - s).count() << endl;

    // Device to host.
    s = system_clock::now();
    compute::copy(v3.begin(), v3.end(), v1.begin());
    e = system_clock::now();
    cout << "gpu2cpu cp " << (e - s).count() << endl;
    return 0;
}

I expected the GPU-to-GPU copy to be fast. On the contrary, CPU-to-CPU was the fastest and GPU-to-GPU was very slow in my case. (My system is an Intel i3 with Intel(R) HD Graphics Skylake ULT GT2.) Maybe parallel processing is one thing and copy speed is another.

cpu2cpu cp 7549776
cpu2gpu cp 18707268
gpu2gpu cp 65841100
gpu2cpu cp 65803119

I hope someone can benefit from this test program.
