
OpenCL — Different kernel “printf()” results on different devices?

I get a peculiar result when running a hello_world kernel that simply prints a buffer passed in as a kernel argument. Two devices on the same platform produce different output; see the bottom of the console output below.

Here is my kernel code:

// Prints the message one character at a time from a single work-item.
__kernel void hello_world (__global const char* message, int messageSize) {
    for (int i = 0; i < messageSize; i++) {
        printf("%c", message[i]);
    }
}

and here is my function call:

  std::string message = "Hello World!";
  int messageSize = static_cast<int>(message.length());
  std::cout << "          ---> Creating Buffer... ";
  // Read-only device buffer initialised from the host string (no terminating '\0' is copied).
  cl::Buffer buffer(CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, sizeof(char) * messageSize, const_cast<char*>(message.c_str()));

  kernel.setArg(0, buffer);
  kernel.setArg(1, sizeof(int), &messageSize);
  std::cout << "Done!" << std::endl;

  // Run the same kernel once on every device of the current platform.
  for (cl_uint i = 0; i < m_deviceCount[m_currentPlatform]; i++) {
      std::cout << "          ---> Queuing Kernel Task on Device #" << m_currentPlatform << "." << i << "... ";
      m_commandQueues[i].enqueueTask(kernel);
      std::cout << "Done!" << std::endl;
      std::cout << "          ---> Executing... Output:\n\n";
      m_commandQueues[i].finish();  // block until the kernel (and its printf output) has completed
      std::cout << "\n\n          ---> Done!" << std::endl;
  }

And my console output:

Found 1 Platforms

Platform #0:
  Name:  AMD Accelerated Parallel Processing
  Found  2 Devices
      Device #0.0:
          --> Name:               Juniper
          --> Vendor:             Advanced Micro Devices, Inc.
          --> Max Compute Units:  10
          --> Max Clock Freq:     850
          --> Global Mem Size:    512 MBs
          --> Local Mem Size:     32 KBs
          --> Hardware Version:   OpenCL 1.2 AMD-APP (1800.11)
          --> Software Version:   1800.11
          --> Open CL Version:    OpenCL C 1.2 
          --> Images Supported:   YES
      Device #0.1:
          --> Name:               Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz
          --> Vendor:             GenuineIntel
          --> Max Compute Units:  8
          --> Max Clock Freq:     3796
          --> Global Mem Size:    15905 MBs
          --> Local Mem Size:     32 KBs
          --> Hardware Version:   OpenCL 1.2 AMD-APP (1800.11)
          --> Software Version:   1800.11 (sse2,avx)
          --> Open CL Version:    OpenCL C 1.2 
          --> Images Supported:   YES

    Using Platform With Most Available Devices: Platform #0
          ---> Creating Context.... Done!
          ---> Creating Command Queue for Device #0.0.... Done!
          ---> Creating Command Queue for Device #0.1.... Done!
          ---> Loading Program: hello_world.cl...
                  > Compiling... Done!
          ---> Creating Buffer... Done!

          ---> Queuing Kernel Task on Device #0.0... Done!
          ---> Executing... Output:

H(null)e(null)l(null)l(null)o(null) (null)W(null)o(null)r(null)l(null)d(null)!(null)

          ---> Done!
          ---> Queuing Kernel Task on Device #0.1... Done!
          ---> Executing... Output:

Hello World!

          ---> Done!

Does anyone know why the AMD GPU inserts "(null)" between characters, while the Intel CPU does not? Is that normal for AMD's implementation of OpenCL?
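
Nothing in the post confirms the cause, so the following is only a minimal sketch for narrowing it down, assuming the kernel is still launched with enqueueTask (a single work-item). The kernel names hello_world_cast and echo_message are hypothetical, and the preprocessor shim at the top is an assumption added so the file also compiles as plain C on the host. The explicit cast should be redundant on a conforming implementation, since a char argument to printf is normally promoted to int anyway; whether it changes the Juniper output is untested.

```c
/* Assumption: this shim is not part of the original post. It lets the same
   source build as plain C on the host; an OpenCL C compiler defines
   __OPENCL_VERSION__ and skips it. */
#ifndef __OPENCL_VERSION__
#include <stdio.h>
#define __kernel
#define __global
#endif

/* Hypothetical variant A: same loop as hello_world, but the char is
   converted to int explicitly before it reaches printf. */
__kernel void hello_world_cast(__global const char* message, int messageSize) {
    for (int i = 0; i < messageSize; i++) {
        printf("%c", (int)message[i]);
    }
}

/* Hypothetical variant B: no device-side printf at all. The kernel copies the
   message into an output buffer that the host can read back and print. */
__kernel void echo_message(__global const char* message,
                           __global char* out,
                           int messageSize) {
    for (int i = 0; i < messageSize; i++) {
        out[i] = message[i];
    }
}
```

If variant B hands "Hello World!" back intact from the Juniper device (for example through a blocking enqueueReadBuffer on a second, writable buffer), the data itself is fine and the stray "(null)" would point at the device's printf handling instead of the buffer contents.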

I also tried to get printf working inside a kernel. You can refer to my program at:

https://github.com/pradyotsn/opencl_printf

Thank you
