
(C++)(Visual Studio) Change RGB to Grayscale

I am accessing the image like so:

pDoc = GetDocument();

int iBitPerPixel = pDoc->_bmp->bitsperpixel;    // used to see if grayscale(8 bits) or RGB (24 bits)
int iWidth = pDoc->_bmp->width;
int iHeight = pDoc->_bmp->height;
BYTE *pImg = pDoc->_bmp->point;     // pointer used to point at pixels in the image
int Wp = iWidth;
const int area = iWidth * iHeight;
int r;          // red pixel value
int g;          // green pixel value
int b;          // blue pixel value
int gray;       // gray pixel value

BYTE *pImgGS = pImg;                    // grayscale image pixel array

and attempting to change the rgb image to gray like so:

    // convert RGB values to grayscale at each pixel, then put in grayscale array
    for (int i = 0; i<iHeight; i++)
        for (int j = 0; j<iWidth; j++)
        {
            r = pImg[i*iWidth * 3 + j * 3 + 2];
            g = pImg[i*iWidth * 3 + j * 3 + 1];
            b = pImg[i*Wp + j * 3];

            r * 0.299;
            g * 0.587;
            b * 0.144;

            gray = std::round(r + g + b);

            pImgGS[i*Wp + j] = gray;
        }

finally, this is how I try to draw the image:

//draw the picture as grayscale
for (int i = 0; i < iHeight; i++) {
    for (int j = 0; j < iWidth; j++) {
        // this should set every corresponding grayscale picture to the current picture as grayscale
        pImg[i*Wp + j] = pImgGS[i*Wp + j];
    }
}
}

Original image: [image] and the resulting image that I get is this: [image]

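First, check that the image is 24 bits per pixel. Second, allocate separate memory for pImgGS. In the question it is just another pointer to pImg, so every gray value written overwrites source pixels that have not been read yet:

BYTE* pImgGS = (BYTE*)malloc(sizeof(BYTE) * iWidth * iHeight);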

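Please refer to this article to see how BMP data is stored. BMP images are stored upside down (bottom row first). Also, in a typical 24-bit file the first 54 bytes are headers (the 14-byte BITMAPFILEHEADER followed by the 40-byte BITMAPINFOHEADER), not pixel data. Hence you should access the values in the following way: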

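double r, g, b;
unsigned char gray;
for (int i = 0; i < iHeight; i++)
{
    for (int j = 0; j < iWidth; j++)
    {
        // source pixels are stored as B, G, R
        r = (double)pImg[(i*iWidth + j)*3 + 2];
        g = (double)pImg[(i*iWidth + j)*3 + 1];
        b = (double)pImg[(i*iWidth + j)*3 + 0];

        r = r * 0.299;
        g = g * 0.587;
        b = b * 0.114;    // the BT.601 blue weight is 0.114, not 0.144

        gray = (unsigned char)floor(r + g + b + 0.5);    // round to nearest

        // flip vertically: BMP rows are stored bottom-up
        pImgGS[(iHeight-i-1)*iWidth + j] = gray;
    }
}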

If padding is present, first determine the padding and then step through the rows by the pitch (bytes per row including padding) instead of by iWidth * 3. Refer to this to understand pitch and padding.

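double r, g, b;
unsigned char gray;
long pitch = ((iWidth * 3 + 3) / 4) * 4;    // bytes per 24bpp row, padded to a multiple of 4
long index = 0;                             // byte offset of the current source row
for (int i = 0; i < iHeight; i++)
{
    for (int j = 0; j < iWidth; j++)
    {
        r = (double)pImg[index + j*3 + 2];
        g = (double)pImg[index + j*3 + 1];
        b = (double)pImg[index + j*3 + 0];

        r = r * 0.299;
        g = g * 0.587;
        b = b * 0.114;    // the BT.601 blue weight is 0.114, not 0.144

        gray = (unsigned char)floor(r + g + b + 0.5);    // round to nearest

        pImgGS[(iHeight-i-1)*iWidth + j] = gray;
    }
    index = index + pitch;    // jump to the start of the next source row
}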
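The pitch in the loop above is the number of bytes per row including padding. In an uncompressed BMP every row is padded to a multiple of 4 bytes, so it could be computed with a helper along these lines. This is only a sketch: bmpRowPitch is a hypothetical name, not something from the original code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: bytes per row of an uncompressed BMP, including the
// padding that rounds every row up to a multiple of 4 bytes.
std::size_t bmpRowPitch(int width, int bitsPerPixel)
{
    assert(width > 0 && bitsPerPixel > 0);
    const std::size_t bitsPerRow = static_cast<std::size_t>(width) * bitsPerPixel;
    return ((bitsPerRow + 31) / 32) * 4;    // round up to whole 32-bit units
}
```

For a 24bpp image, the padding at the end of each row is then bmpRowPitch(iWidth, 24) - iWidth * 3 bytes.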

When drawing the image, remember that pImg is 24bpp, so you need to copy the gray value three times, once into each of the R, G and B channels. If you ultimately want to save the grayscale image in BMP format, you again have to write the data upside down; alternatively, simply skip the vertical flip when converting to gray, i.e. this step:

pImgGS[(iHeight-i-1)*iWidth + j] = gray;
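Writing the gray value into all three channels can also be done directly in the 24bpp buffer, which avoids both the separate gray array and the vertical flip. A minimal sketch, assuming B, G, R byte order and rows that are pitch bytes apart (grayInPlace24 and its parameters are hypothetical names, not part of the question's code):

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: turns a 24bpp BGR image gray in place by writing the
// same luma value to the B, G and R bytes of every pixel.
// pitch is the number of bytes per row, including any padding.
void grayInPlace24(std::uint8_t* pixels, int width, int height, std::size_t pitch)
{
    assert(pixels != nullptr && width > 0 && height > 0);
    assert(pitch >= static_cast<std::size_t>(width) * 3);
    for (int y = 0; y < height; ++y)
    {
        std::uint8_t* row = pixels + static_cast<std::size_t>(y) * pitch;
        for (int x = 0; x < width; ++x)
        {
            std::uint8_t* px = row + x * 3;    // px[0] = B, px[1] = G, px[2] = R
            const double luma = 0.299 * px[2] + 0.587 * px[1] + 0.114 * px[0];
            const std::uint8_t gray = static_cast<std::uint8_t>(std::lround(luma));
            px[0] = px[1] = px[2] = gray;      // same value in all three channels
        }
    }
}
```

With the question's variables this would presumably be called as grayInPlace24(pImg, iWidth, iHeight, pitch), given that BYTE is an unsigned 8-bit type.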

tl;dr:

Make one common path. Convert everything to 32 bits in a well-defined manner, and do not use image dimensions or coordinates. Refactor the YCbCr conversion (= grey value calculation) into a separate function; this is easier to read and runs at exactly the same speed.

The lengthy stuff

First, you seem to have been confused by strides and offsets. The artefact that you see is there because you accidentally wrote out one value per pixel (and in total only one third of the data) when you should have written three values.
One can get confused by this easily, but here it happened because you do work that you did not need to do in the first place. You are iterating over coordinates left to right, top to bottom, and painstakingly calculating the correct byte offset in the data for each location.
However, you're doing a full-screen effect, so what you really want is to iterate over the complete image. Who cares about the width and height? You know the beginning of the data, and you know the length. One loop over the complete blob will do the same, only faster, with less obscure code, and with fewer opportunities for getting something wrong.

Next, 24-bit bitmaps are common as files, but they are rather unusual for in-memory representation because the format is nasty to access and unsuitable for hardware. Drawing such a bitmap will require a lot of work from the driver or the graphics hardware (it will work, but it will not work well). Therefore, 32-bit depth is usually a much better, faster, and more comfortable choice. It is much more "natural" to access program-wise.
You can rather trivially convert 24-bit to 32-bit. Iterate over the complete bitmap data and write out a complete 32-bit word for each 3-byte tuple read. Windows bitmaps ignore the A channel (the highest-order byte), so just leave it zero, or whatever.
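The 24-to-32-bit expansion could look roughly like the sketch below. It assumes B, G, R byte order in the source and, unlike the plain "one loop over the blob" description, steps row by row so that padded 24-bit rows are handled. The name expand24To32 and its parameters are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: expands 24bpp BGR rows (pitch bytes apart) into one
// 32-bit word per pixel, laid out as 0x00RRGGBB. The top byte is left zero.
std::vector<std::uint32_t> expand24To32(const std::uint8_t* src, int width,
                                        int height, std::size_t pitch)
{
    assert(src != nullptr && width > 0 && height > 0);
    std::vector<std::uint32_t> dst;
    dst.reserve(static_cast<std::size_t>(width) * height);
    for (int y = 0; y < height; ++y)
    {
        const std::uint8_t* px = src + static_cast<std::size_t>(y) * pitch;
        for (int x = 0; x < width; ++x, px += 3)
        {
            const std::uint32_t b = px[0], g = px[1], r = px[2];
            dst.push_back((r << 16) | (g << 8) | b);
        }
    }
    return dst;
}
```

The result has no row padding at all, since a row of 32-bit pixels is always a multiple of 4 bytes long.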

Also, there is no such thing as an 8-bit greyscale bitmap. This simply doesn't exist. Although there are bitmaps that look like greyscale bitmaps, they are in reality paletted 8-bit bitmaps where (incidentally) the bmiColors member contains only grey values.

Therefore, unless you can guarantee that you will only ever process images that you have created yourself, you cannot rely on, e.g., the values 5 and 73 corresponding to 5/255 and 73/255 greyscale intensity, respectively. That may be the case, but in general it is a wrong assumption.
To be on the safe side as far as correctness goes, you must convert your 8-bit greyscale bitmaps to real colors by looking up the indices (the bitmap's grey values are really indices) in the palette. Otherwise, you could be loading a greyscale image where the palette is the other way around (so 5 would mean 250 and 250 would mean 5), or a bitmap which isn't greyscale at all.
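The palette lookup might be sketched as follows. PaletteEntry is a stand-in with the same field order as the Windows RGBQUAD structure, declared locally only so the example is self-contained; expandPaletted8To32 is a hypothetical name, and a full 256-entry palette is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for the Windows RGBQUAD palette entry (same field order),
// declared here only to keep the sketch self-contained.
struct PaletteEntry
{
    std::uint8_t blue, green, red, reserved;
};

// Hypothetical helper: resolves 8-bit palette indices into 0x00RRGGBB words.
// pitch is the number of bytes per source row, including any padding.
std::vector<std::uint32_t> expandPaletted8To32(const std::uint8_t* indices,
                                               int width, int height,
                                               std::size_t pitch,
                                               const PaletteEntry (&palette)[256])
{
    assert(indices != nullptr && width > 0 && height > 0);
    std::vector<std::uint32_t> dst;
    dst.reserve(static_cast<std::size_t>(width) * height);
    for (int y = 0; y < height; ++y)
    {
        const std::uint8_t* row = indices + static_cast<std::size_t>(y) * pitch;
        for (int x = 0; x < width; ++x)
        {
            const PaletteEntry& c = palette[row[x]];    // index -> real color
            dst.push_back((static_cast<std::uint32_t>(c.red) << 16) |
                          (static_cast<std::uint32_t>(c.green) << 8) |
                          c.blue);
        }
    }
    return dst;
}
```

With an inverted grey palette, index 5 comes out as 0x00FAFAFA (250) rather than 0x00050505, which is exactly the case that treating indices as intensities would get wrong.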

So... you want to convert 24-bit and you want to convert 8-bit bitmaps, both to 32-bit depth. That means you do all the annoying what-if stuff once at the beginning, and the rest is one identical common path. That's a good thing.
What you will be showing on-screen is always a 32-bit bitmap where the topmost byte is ignored, and the lower three are all the same value, resulting in what looks like a shade of grey. That's simple, and simple is good.

Note that if you do a BT.601-style YCbCr conversion (as indicated by your use of the constants 0.299, 0.587, and 0.144; the BT.601 blue weight is actually 0.114), and if your 8-bit greyscale images are perceptual (this is something you must know, there is no way of telling from the file!), then for 100% correctness you need to do the inverse transformation when converting from paletted 8-bit to RGB. Otherwise, your final result will look almost right, but not quite. If your 8-bit greyscales are linear, i.e. were created without using the above constants (again, you must know, you cannot tell from the image), you need to copy everything as-is (here, doing the conversion would make it look almost-but-not-quite right).

About the RGB-to-greyscale conversion: you do not need an extra greyscale bitmap just to hold values that you never need again afterwards. You can read the three color values from the loaded bitmap, calculate Y, and directly build the 32-bit ARGB word, which you then write out to the final bitmap. This saves one entirely useless round-trip to memory.

Something like this:

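// in: pointer to the 24-bit source bytes, inputSize: their total length
uint32_t* out = (uint32_t*) output_bitmap_data;

for(int i = 0; i < inputSize; i += 3, in += 3)   // advance the read pointer one pixel at a time
{
    uint8_t Y = calc_greyscale(in[0], in[1], in[2]);
    *out++ = (Y<<16) | (Y<<8) | Y;
}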
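calc_greyscale is not defined in the answer. A minimal sketch, assuming the three arguments arrive in B, G, R order (as they are stored in a 24-bit BMP) and using the BT.601 weights with the blue weight 0.114:

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Assumed signature: arguments in B, G, R order, result is the BT.601 luma.
std::uint8_t calc_greyscale(std::uint8_t b, std::uint8_t g, std::uint8_t r)
{
    // The weights sum to 1.0, so the rounded result always fits in 0..255.
    const long y = std::lround(0.299 * r + 0.587 * g + 0.114 * b);
    assert(y >= 0 && y <= 255);
    return static_cast<std::uint8_t>(y);
}
```

With 0.144 as the blue weight the weights would sum to 1.03, so pure white would evaluate to about 263 and no longer fit into 8 bits.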

Alternatively, you can also do the from-whatever-to-32 conversion, and then do the to-greyscale conversion in-place there. This, in turn, introduces an extra round-trip to memory, but the code becomes much, much easier overall.
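The in-place variant on an already converted 32-bit buffer might look like the sketch below. It assumes 0x00RRGGBB words and BT.601 weights; greyInPlace32 is a hypothetical name. Because rows of 32-bit pixels never carry padding, a single flat loop over the pixel count is enough and neither width nor height is needed.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: greys a buffer of 0x00RRGGBB words in place with one
// flat loop over all pixels.
void greyInPlace32(std::uint32_t* pixels, std::size_t pixelCount)
{
    assert(pixels != nullptr || pixelCount == 0);
    for (std::size_t i = 0; i < pixelCount; ++i)
    {
        const std::uint32_t p = pixels[i];
        const double r = (p >> 16) & 0xFF;
        const double g = (p >> 8) & 0xFF;
        const double b = p & 0xFF;
        const std::uint32_t y =
            static_cast<std::uint32_t>(std::lround(0.299 * r + 0.587 * g + 0.114 * b));
        pixels[i] = (y << 16) | (y << 8) | y;    // same value in R, G and B
    }
}
```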
