I am converting an image from colour to grayscale using CUDA 5 / VC 2008.
The CUDA kernel is:
__global__ static void rgba_to_grayscale(const uchar4* const rgbaImage, unsigned char* const greyImage,
                                         int numRows, int numCols)
{
    int pos = blockIdx.x * blockDim.x + threadIdx.x;
    if (pos < numRows * numCols) {
        uchar4 zz = rgbaImage[pos];
        float out = 0.299f * zz.x + 0.587f * zz.y + 0.114f * zz.z;
        greyImage[pos] = (unsigned char) out;
    }
}
The C++ function is:
inline unsigned char rgba_to_grayscale(uchar4 rgbaImage)
{
    return (unsigned char) 0.299f * rgbaImage.x + 0.587f * rgbaImage.y + 0.114f * rgbaImage.z;
}
and they are both called appropriately. However they are yielding different results.
Can anybody explain why the results are different?
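There is no problem with your CUDA function. The CPU version is incorrect. A cast binds more tightly than multiplication, so (unsigned char) applies only to the constant 0.299f, not to the whole expression. That constant truncates to 0, so the red term vanishes and only the green and blue terms contribute. Your return statement
is equivalent to the following code:
inline unsigned char rgba_to_grayscale(uchar4 rgbaImage)
{
    return ((unsigned char) 0.299f) * rgbaImage.x + 0.587f * rgbaImage.y + 0.114f * rgbaImage.z;
}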
You have to cast the final result to unsigned char, like this:
inline unsigned char rgba_to_grayscale(uchar4 rgbaImage)
{
    return (unsigned char) (0.299f * rgbaImage.x + 0.587f * rgbaImage.y + 0.114f * rgbaImage.z);
}
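To see the effect of the misplaced cast in isolation, here is a minimal host-only sketch. The struct Pixel4 is a hypothetical stand-in for CUDA's uchar4 so that the snippet compiles without the CUDA headers, and the function names gray_buggy and gray_fixed are made up for illustration.

```cpp
// Hypothetical stand-in for CUDA's uchar4 (same field names, host-only).
struct Pixel4 { unsigned char x, y, z, w; };

// Cast applies only to 0.299f, which truncates to 0: the red term is lost.
inline unsigned char gray_buggy(Pixel4 p)
{
    return (unsigned char) 0.299f * p.x + 0.587f * p.y + 0.114f * p.z;
}

// Cast applies to the whole weighted sum, matching the CUDA kernel.
inline unsigned char gray_fixed(Pixel4 p)
{
    return (unsigned char) (0.299f * p.x + 0.587f * p.y + 0.114f * p.z);
}
```

For a pure red pixel {255, 0, 0, 255}, gray_buggy returns 0 while gray_fixed returns 76, which is the kind of mismatch you would see between the CPU and GPU outputs.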
@sga91 was pretty much there, but it also appears that the byte (channel) order is different:
inline unsigned char rgba_to_grayscale(uchar4 rgbaImage)
{
    return (unsigned char) (0.299f * rgbaImage.z + 0.587f * rgbaImage.y + 0.114f * rgbaImage.x);
}
Note that x and z are transposed, which is what you would expect if the pixel data is actually stored in BGRA order.
I remember reading about this before, but I cannot find the reference now.
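If the channel order is in doubt, one option is to make it an explicit parameter rather than hard-coding which field is red. The following is only a sketch under that assumption: Px4 is a hypothetical stand-in for uchar4, and ChannelOrder and to_gray are illustrative names, not part of CUDA or of the original code.

```cpp
// Hypothetical stand-in for CUDA's uchar4 (host-only).
struct Px4 { unsigned char x, y, z, w; };

// Which field holds red: x for RGBA data, z for BGRA data.
enum class ChannelOrder { RGBA, BGRA };

inline unsigned char to_gray(Px4 p, ChannelOrder order)
{
    // Pick red and blue according to the declared layout; green is always y.
    const unsigned char r = (order == ChannelOrder::RGBA) ? p.x : p.z;
    const unsigned char b = (order == ChannelOrder::RGBA) ? p.z : p.x;
    return static_cast<unsigned char>(0.299f * r + 0.587f * p.y + 0.114f * b);
}
```

For the bytes {200, 100, 50, 255}, this gives 124 when read as RGBA and 96 when read as BGRA, so a layout mismatch between the two code paths shows up as a visible brightness difference.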