
Use GPU and CPU wisely

I'm a newbie to OpenCL and have just started learning. I wanted to know whether it is possible to execute a few threads on the GPU and the remaining threads on the CPU. In other words, if I launch 100 threads and assume that I have an 8-core CPU, is it possible that 8 of the 100 threads will execute on the CPU and the remaining 92 threads will run on the GPU? Can OpenCL help me do this job smoothly?

I wanted to know whether it is possible to execute a few threads on the GPU and the remaining threads on the CPU?

Yes

In other words, if I launch 100 threads and assume that I have an 8-core CPU, is it possible that 8 of the 100 threads will execute on the CPU and the remaining 92 threads will run on the GPU?

No. That description suggests that you'd be viewing the GPU & CPU as a single compute resource. You can't do that.

That doesn't mean you can't have both working on the same task.

  • The GPU and CPU will be considered to be separate OpenCL devices.
  • You can write code that can talk to multiple devices.
  • You can compile the same kernel for multiple devices.
  • You can ask for multiple devices to do work at the same time.

...but...

  • None of this is automatic.
  • OpenCL won't split a single NDRange (or equivalent) call between multiple devices.
  • This means you'd have to schedule tasks between the two devices yourself (see the setup sketch after this list).
  • There's going to be quite a large disparity in speed, so keeping it optimal will require more than "92 here, 8 there".
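
To make those points concrete, here is a minimal host-side sketch that enumerates a CPU device and a GPU device, builds the same kernel for both, and gives each its own command queue. It assumes a single platform exposes both device types (as the AMD runtime does); if your CPU and GPU come from different platforms you need one context per platform instead. Error handling is reduced to one helper for brevity.

```c
#include <CL/cl.h>
#include <stdio.h>
#include <stdlib.h>

static void check(cl_int err, const char *what) {
    if (err != CL_SUCCESS) {
        fprintf(stderr, "%s failed: %d\n", what, err);
        exit(1);
    }
}

int main(void) {
    cl_int err;
    cl_platform_id platform;
    check(clGetPlatformIDs(1, &platform, NULL), "clGetPlatformIDs");

    /* Ask the same platform for one CPU device and one GPU device. */
    cl_device_id cpu, gpu;
    check(clGetDeviceIDs(platform, CL_DEVICE_TYPE_CPU, 1, &cpu, NULL), "get CPU device");
    check(clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &gpu, NULL), "get GPU device");

    /* One context holding both devices (only valid within a single platform). */
    cl_device_id devs[2] = { cpu, gpu };
    cl_context ctx = clCreateContext(NULL, 2, devs, NULL, NULL, &err);
    check(err, "clCreateContext");

    /* The same kernel source is built at run time for both devices. */
    const char *src =
        "__kernel void scale(__global float *x) {\n"
        "    x[get_global_id(0)] *= 2.0f;\n"
        "}\n";
    cl_program prog = clCreateProgramWithSource(ctx, 1, &src, NULL, &err);
    check(err, "clCreateProgramWithSource");
    check(clBuildProgram(prog, 2, devs, NULL, NULL, NULL), "clBuildProgram");

    /* One command queue per device: OpenCL never schedules work across
     * queues for you, so any CPU/GPU split is decided by the host code. */
    cl_command_queue cpu_q = clCreateCommandQueue(ctx, cpu, 0, &err);
    check(err, "CPU queue");
    cl_command_queue gpu_q = clCreateCommandQueue(ctx, gpu, 0, &err);
    check(err, "GPU queue");

    printf("Same kernel built for a CPU device and a GPU device.\n");

    clReleaseCommandQueue(cpu_q);
    clReleaseCommandQueue(gpu_q);
    clReleaseProgram(prog);
    clReleaseContext(ctx);
    return 0;
}
```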

What I've found works better is having the CPU work on a different task whilst the GPU is working. Maybe preparing the next piece of work for the GPU, or post-processing the results from the GPU. Sometimes this is normal code. Sometimes it's OpenCL.
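
A minimal sketch of that overlap pattern, assuming a GPU command queue, a kernel, and an output buffer already exist; `prepare_next_chunk` is a hypothetical stand-in for whatever CPU-side work you run while the GPU is busy:

```c
#include <CL/cl.h>
#include <stddef.h>

/* Hypothetical CPU-side work performed while the GPU is busy. */
void prepare_next_chunk(float *chunk, size_t n);

/* Enqueue one GPU pass asynchronously, overlap CPU work, then collect results. */
void run_overlapped(cl_command_queue gpu_q, cl_kernel kernel,
                    cl_mem buf_out, float *host_out,
                    float *next_chunk, size_t n)
{
    size_t global = n;

    /* Non-blocking: returns as soon as the command is queued. */
    clEnqueueNDRangeKernel(gpu_q, kernel, 1, NULL, &global, NULL, 0, NULL, NULL);
    clFlush(gpu_q);                      /* nudge the GPU to start now */

    prepare_next_chunk(next_chunk, n);   /* ordinary CPU code runs meanwhile */

    /* The blocking read doubles as the synchronisation point with the GPU. */
    clEnqueueReadBuffer(gpu_q, buf_out, CL_TRUE, 0, n * sizeof(float),
                        host_out, 0, NULL, NULL);
}
```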

You can use multiple OpenCL devices to work on your algorithm, but the workload needs to be partitioned carefully enough that the work is balanced across devices; otherwise the overhead may make your overall runtime worse.

The AMD OpenCL Programming Guide, section 4.7, clearly covers using multiple OpenCL devices. So my answer is: yes, you can divide the work across multiple devices smoothly, but only if your scheduling algorithm is smart enough to balance the whole thing.
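
As a rough illustration of such manual partitioning (not the AMD guide's own code), the same 1-D NDRange can be split into two contiguous slices using a global work offset, one enqueued per device; the 90/10 ratio below is purely an assumption that you would tune so both devices finish at about the same time:

```c
#include <CL/cl.h>
#include <stddef.h>

/* Split one 1-D index space between a CPU queue and a GPU queue.
 * The same kernel object is reused; each enqueue covers a different slice. */
void split_ndrange(cl_command_queue cpu_q, cl_command_queue gpu_q,
                   cl_kernel kernel, size_t total)
{
    size_t gpu_share = total * 9 / 10;    /* assumed ratio, not a rule */
    size_t cpu_share = total - gpu_share;
    size_t gpu_offset = 0;
    size_t cpu_offset = gpu_share;

    /* Each device gets its own contiguous slice via the global work offset. */
    clEnqueueNDRangeKernel(gpu_q, kernel, 1, &gpu_offset, &gpu_share,
                           NULL, 0, NULL, NULL);
    clEnqueueNDRangeKernel(cpu_q, kernel, 1, &cpu_offset, &cpu_share,
                           NULL, 0, NULL, NULL);

    /* The slower device determines the total runtime, which is exactly
     * why the split ratio has to be tuned rather than fixed. */
    clFinish(gpu_q);
    clFinish(cpu_q);
}
```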

OpenCL code is compiled at run time for the selected device (a CPU, or a particular model of GPU).

You can switch which target you use for different tasks, but you can't (with any implementation I know of) split the same task between the CPU and the GPU.
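
A small sketch of that per-task selection, assuming a single platform: pick a device of the requested type, then compile the same source for exactly that device at run time.

```c
#include <CL/cl.h>
#include <stdio.h>

/* Pick one device of the requested type (CL_DEVICE_TYPE_CPU or _GPU)
 * from the first platform; a robust version would search all platforms. */
cl_device_id pick_device(cl_device_type type)
{
    cl_platform_id platform;
    cl_device_id dev = NULL;
    if (clGetPlatformIDs(1, &platform, NULL) != CL_SUCCESS) return NULL;
    if (clGetDeviceIDs(platform, type, 1, &dev, NULL) != CL_SUCCESS) return NULL;
    return dev;
}

/* Compile the kernel source at run time for exactly this device. */
cl_program build_for(cl_context ctx, cl_device_id dev, const char *src)
{
    cl_int err;
    cl_program prog = clCreateProgramWithSource(ctx, 1, &src, NULL, &err);
    if (err != CL_SUCCESS) return NULL;

    if (clBuildProgram(prog, 1, &dev, NULL, NULL, NULL) != CL_SUCCESS) {
        /* The build log is device-specific, since compilation happens here. */
        char log[4096];
        clGetProgramBuildInfo(prog, dev, CL_PROGRAM_BUILD_LOG,
                              sizeof(log), log, NULL);
        fprintf(stderr, "build log:\n%s\n", log);
        clReleaseProgram(prog);
        return NULL;
    }
    return prog;
}
```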
