I have a C++ input that is formed as a std::vector and I would like to pass it to my OpenCL kernel (Nvidia platform).
But I keep getting a segmentation fault when the following is executed:
queue.enqueueReadBuffer(dev_input, CL_TRUE, 0, sizeof(bool) * input.size(), &input);
Therefore, I tried to copy my std::vector<bool> to a bool[] and everything worked perfectly. However, the common methods to convert a vector to a C array (&input[0], input.data()) don't work at all.
Would you have any suggestions either for the ReadBuffer or the fast assignment to a C array?
Thank you!
Two problems.
1. Implementations can (optionally) optimize std::vector<bool> to be more space efficient, perhaps by packing values into individual bits of memory. This will not match the layout of an array of bool.
2. You can't pass the address of a vector as if it were an array.
std::vector<bool> input;
queue.enqueueReadBuffer(
    dev_input, CL_TRUE, 0,
    sizeof(bool) * input.size(),
    &input);
//  ^^^^^^ wrong
If you want to pass a vector as a pointer, you have to use either input.data() or something like &input[0]. The reason these don't work here is that std::vector<bool> is designed to prevent it: it has no data() member, and input[0] returns a proxy object rather than a bool&, because the implementation might be optimized (see point #1).
This is basically a "wart" in the C++ library that has gotten baked in by time.
Fixing this is a pain.
// One way to do things...
struct Bool { bool value; };
std::vector<Bool> input;
// Another way to do things...
// (You have to choose the right type here, it depends
// on your platform)
std::vector<int> input;
// Yet another way...
bool *input = new bool[N];
That's why this is such a big stinking wart.
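As a rough illustration of the "choose another element type" workaround, here is a minimal sketch that expands a std::vector<bool> into one byte per element, so that data() yields a real contiguous array. The helper name to_bytes is hypothetical, and std::uint8_t is used as a stand-in for the OpenCL host type cl_uchar (assumed here to have the same width).

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>

// std::vector<bool> hands out proxy objects rather than bool&, so there is
// no contiguous array of bool to take the address of.
static_assert(!std::is_same<std::vector<bool>::reference, bool&>::value,
              "std::vector<bool>::reference is a proxy type");

// Hypothetical helper (not part of any library): expand each element of a
// std::vector<bool> into one byte holding 0 or 1.
std::vector<std::uint8_t> to_bytes(const std::vector<bool>& bits) {
    std::vector<std::uint8_t> bytes(bits.size());
    for (std::size_t i = 0; i < bits.size(); ++i) {
        bytes[i] = bits[i] ? 1 : 0;
    }
    return bytes;
}
```

The resulting vector could then be handed to OpenCL with something like queue.enqueueWriteBuffer(dev_input, CL_TRUE, 0, bytes.size(), bytes.data()), assuming the kernel declares the matching argument as a uchar pointer rather than a bool pointer.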
One possibility is to use the copy() algorithm provided by Boost.Compute.
Basically, you can do something like this:
// vector of bools on the host
std::vector<bool> host_vec = { ... };
// vector of bools on the device (each represented with a uchar)
boost::compute::vector<boost::compute::uchar_> gpu_vec(host_vec.size(), ctx);
// copy bool values on host to the device
boost::compute::copy(host_vec.begin(), host_vec.end(), gpu_vec.begin(), queue);
And Boost.Compute will take care of copying the data to the compute device correctly.
One more thing to add:
The bool type within an OpenCL kernel isn't guaranteed to match any of the host-side types. Generally, you should avoid using boolean types in data exchanged between host and device.
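One way to keep bool out of the transfer entirely is to read results back into a byte buffer and convert on the host afterwards. The sketch below assumes the kernel writes one uchar per flag; from_bytes is a hypothetical helper name, and std::uint8_t again stands in for cl_uchar.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of any library): rebuild a std::vector<bool>
// from a byte buffer, treating any nonzero byte as true.
std::vector<bool> from_bytes(const std::vector<std::uint8_t>& bytes) {
    std::vector<bool> bits(bytes.size());
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        bits[i] = (bytes[i] != 0);
    }
    return bits;
}
```

On the host, the byte buffer would be filled first, for example with queue.enqueueReadBuffer(dev_input, CL_TRUE, 0, bytes.size(), bytes.data()), and only then converted with from_bytes(bytes).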