
Is there any way to stop an OpenCL kernel from executing?

Is there any way to stop an OpenCL kernel while it is executing? For example, I launch the kernel, do some other computations, and then stop it if some condition is met; otherwise, I wait until it finishes:

clEnqueueNDRangeKernel(queue, ...); // start kernel function

// do other stuff...
// ...

if (some condition met) {
    stopKernel();
} else { 
    clFinish(queue);
}

Thank you for your help.

No. Once you have enqueued your kernel, it will run to completion.

One way to accomplish something like the above is to submit the work in pieces:

while ( data_left_to_process ) {

   clEnqueueNDRangeKernel( ..., NDRange for a portion of the data, ... )

   // other work

   if (condition) {
      break;
   }

   // adjust the NDRange so the next execution processes the next part of your data

}

clFinish(queue);

This lets you avoid processing all of the data, with the obvious tradeoff that you are now submitting work in smaller chunks, which will probably have a performance impact.
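The chunking loop above can be sketched in plain C without any OpenCL dependency. This is only an illustration under assumptions: `submit_in_chunks`, `submit_chunk_fn`, and `stop_after_n` are hypothetical names, and the callback merely stands in for the point where a real program would call clEnqueueNDRangeKernel with the chunk's offset and size as its global_work_offset and global_work_size arguments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback standing in for one clEnqueueNDRangeKernel call.
   It receives the offset and size of a single chunk and returns nonzero
   to request that no further chunks be submitted. */
typedef int (*submit_chunk_fn)(size_t offset, size_t size, void *user);

/* Walks the range [0, total) in pieces of at most `chunk` items.
   Returns the number of items that were actually submitted. */
size_t submit_in_chunks(size_t total, size_t chunk,
                        submit_chunk_fn submit, void *user)
{
    assert(chunk > 0);
    size_t offset = 0;
    while (offset < total) {
        size_t remaining = total - offset;
        /* The last chunk may be smaller than the others. */
        size_t size = remaining < chunk ? remaining : chunk;
        int stop = submit(offset, size, user);
        offset += size;
        if (stop)
            break;
    }
    return offset;
}

/* Example callback: asks to stop once the counter behind `user`
   reaches zero, i.e. after that many chunks have been submitted. */
int stop_after_n(size_t offset, size_t size, void *user)
{
    (void)offset;
    (void)size;
    size_t *left = (size_t *)user;
    return --(*left) == 0;
}
```

The return value tells the host how much of the data was handed to the device before the stop condition fired, which is the piece of state the plain `break` in the loop above does not record.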

Possibly.

  1. Create two command queues in a context.
  2. Create two kernels, one to do the work and another to halt execution. Each kernel has access to a shared global buffer.
  3. Enqueue the first kernel on queue1.
  4. Enqueue the second kernel on queue2 when you want to halt execution.

Alternatively, you could use an out-of-order queue and enqueue the second kernel on the same command queue to halt execution. You have to be a little more careful (using clFinish/clFlush as necessary), but it is a more natural way of doing this.

Some pseudocode (for multiple queues):

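clEnqueueNDRangeKernel(queue1, kernel1, ...); // start work kernel
clFlush(queue1); // make sure the work kernel is actually submitted to the device
// do other stuff
// ...
if (condition) {
    clEnqueueNDRangeKernel(queue2, kernel2, ...); // stop work kernel
    clFlush(queue2); // submit the stop kernel before blocking on queue1
}
clFinish(queue1);
clFinish(queue2); // entirely unnecessary if using in-order queues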

Use a buffer of ints or floats as your stopping variable and index it by global_id within your kernels, to reduce the cost of reading from global memory within a loop. The downside is that your state will be indeterminate: without further variables to count executions and so on, you won't know how many work items have been executed, or which ones.

And the kernels:

// volatile forces g_stop to be re-read from global memory on every iteration
kernel void kernel1( ... , volatile global int * g_stop)
{
    int index_from_ids = ...;
    while (g_stop[index_from_ids] == 0) // or 'if' for single pass execution
    {
        // do work here
    }
}

kernel void kernel2( ... , global int * g_stop)
{
    int index_from_ids = ...;
    g_stop[index_from_ids] = 1;
}
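To address the indeterminate-state problem mentioned above, one option is a second buffer that records which work items actually ran. The following is a host-side simulation in plain C, not OpenCL code: `work_item`, `stop_item`, `count_done`, and the `g_done` buffer are hypothetical names introduced for illustration, and the sketch models the single-pass (`if`) variant of the work kernel rather than the `while` loop.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for one work item of the work kernel (single-pass variant).
   It does its unit of work only if the stop flag for its index is still
   clear, and records completion in g_done (a hypothetical extra buffer). */
void work_item(size_t id, const volatile int *g_stop, int *g_done)
{
    if (g_stop[id] == 0) {
        /* do work here */
        g_done[id] = 1;
    }
}

/* Stand-in for one work item of the stop kernel. */
void stop_item(size_t id, volatile int *g_stop)
{
    g_stop[id] = 1;
}

/* Host-side summary: how many work items completed before the stop. */
size_t count_done(const int *g_done, size_t n)
{
    size_t done = 0;
    for (size_t i = 0; i < n; i++)
        done += (g_done[i] != 0);
    return done;
}
```

After reading `g_done` back, the host knows both how many work items ran and which ones did, at the cost of one extra global write per work item.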

One way to do it is to split the workload into chunks. For example, if you have a 10000x10000 global work size like this:

clEnqueueNDRangeKernel(queue, kernel, 2, NDRange(0,0), NDRange(10000,10000),... );

You can do it in chunks, like so:您可以分块进行,如下所示:

for(int i=0; i<100; i++)
    for(int j=0; j<100; j++)
         if(condition)
             clEnqueueNDRangeKernel(queue, kernel, 2, NDRange(i*100,j*100), NDRange(100,100),... );

You might need to call clFinish on the queue inside the loop in some cases. Chunking has other advantages: it avoids timeouts that terminate applications whose kernels run too long, such as NVIDIA's watchdog timer, and it lets you implement a progress bar in your GUI if you need one.
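The tile arithmetic and the progress-bar idea can be sketched in plain C as follows. The names `tile`, `tile_at`, and `percent_done` are hypothetical, and the sketch only computes the offset and size that would be passed to each enqueue call; it assumes row-major tile order and clips tiles at the edges when the range is not a multiple of the tile size.

```c
#include <assert.h>
#include <stddef.h>

/* Offset and size of one 2-D tile, in the order (dimension 0, dimension 1). */
typedef struct {
    size_t offset[2];
    size_t size[2];
} tile;

/* Fills `out` with tile number `index` of a width x height range cut into
   tile_w x tile_h pieces, numbered row by row. Edge tiles are clipped.
   Returns 1 on success, 0 if `index` is past the last tile. */
int tile_at(size_t width, size_t height, size_t tile_w, size_t tile_h,
            size_t index, tile *out)
{
    assert(tile_w > 0 && tile_h > 0);
    size_t cols = (width + tile_w - 1) / tile_w;   /* tiles per row */
    size_t rows = (height + tile_h - 1) / tile_h;  /* number of rows */
    if (index >= cols * rows)
        return 0;
    out->offset[0] = (index % cols) * tile_w;
    out->offset[1] = (index / cols) * tile_h;
    size_t left_w = width - out->offset[0];
    size_t left_h = height - out->offset[1];
    out->size[0] = left_w < tile_w ? left_w : tile_w;
    out->size[1] = left_h < tile_h ? left_h : tile_h;
    return 1;
}

/* Integer percentage for a progress bar after `done` of `total` tiles. */
unsigned percent_done(size_t done, size_t total)
{
    return total ? (unsigned)(done * 100 / total) : 100u;
}
```

A host loop could call `tile_at` with an increasing index, enqueue one kernel per tile, check the stop condition between tiles, and update the GUI with `percent_done`.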
