
Swscale - image patch (NV12) color conversion - invalid border

The goal is to convert an NV12 image to BGR24, or more exactly an image patch (x:0, y:0, w:220, h:220).
The issue is an undefined pixel column on the right edge of the converted patch.

The question is why this happens, even though the coordinates and the dimensions of the patch are even. (Interestingly, the issue does not appear for an odd width.)


The patch has the following bounding box: (x:0, y:0, w:220, h:220).
The behavior should be reproducible with any image. Conversion to PPM can be done using the ppm conversion page.

The following code creates an NV12 image from a BGR24 image and then converts an NV12 patch back to a BGR24 patch. If everything worked properly, the output would be identical to the source image.

#include <stdint.h>
#include <stdio.h>
#include <libswscale/swscale.h>
#include <libavutil/imgutils.h>
#include <libavutil/mem.h>

void readPPM(const char* filename, uint8_t** bgrData, int* stride, int* w, int* h)
{
    FILE* fp = fopen(filename, "rb");
    fscanf(fp, "%*s\n"); //skip format check

    fscanf(fp, "%d %d\n", w, h);
    fscanf(fp, "%*d\n"); //skip max value check

    *stride = *w * 3;
    *bgrData = av_malloc(*h * *stride);

    for (int r = 0; r < *h; r++)
    {
        uint8_t* rowData = *bgrData + r * *stride;
        for (int c = 0; c < *w; c++)
        {
            //rgb -> bgr
            fread(&rowData[2], 1, 1, fp);
            fread(&rowData[1], 1, 1, fp);
            fread(&rowData[0], 1, 1, fp);

            rowData += 3;
        }
    }

    fclose(fp);
}

void writePPM(const char* filename, uint8_t* bgrData, int stride, int w, int h)
{
    FILE* fp = fopen(filename, "wb");
    fprintf(fp, "P6\n");
    fprintf(fp, "%d %d\n", w, h);
    fprintf(fp, "%d\n", 255);

    for (int r = 0; r < h; r++)
    {
        uint8_t* rowData = bgrData + r * stride;
        for (int c = 0; c < w; c++)
        {
            //bgr -> rgb
            fwrite(&rowData[2], 1, 1, fp);
            fwrite(&rowData[1], 1, 1, fp);
            fwrite(&rowData[0], 1, 1, fp);

            rowData += 3;       
        }
    }

    fclose(fp);
}


void bgrToNV12(uint8_t* srcData[4], int srcStride[4], 
               uint8_t* tgtData[4], int tgtStride[4],
               int w, int h)
{
    struct SwsContext* context = sws_getContext(w, h, AV_PIX_FMT_BGR24,
                                                w, h, AV_PIX_FMT_NV12, SWS_POINT, NULL, NULL, NULL);
    {
        sws_scale(context,
                  srcData, srcStride, 0, h,
                  tgtData, tgtStride);
    }
    sws_freeContext(context);
}

void nv12ToBgr(uint8_t* srcData[4], int srcStride[4],
               uint8_t* tgtData[4], int tgtStride[4],
               int w, int h)
{
    struct SwsContext* context = sws_getContext(w, h, AV_PIX_FMT_NV12,
                                                w, h, AV_PIX_FMT_BGR24, SWS_POINT, NULL, NULL, NULL);
    {
        sws_scale(context,
                  srcData, srcStride, 0, h,
                  tgtData, tgtStride);
    }
    sws_freeContext(context);
}


int main()
{
    //load BGR image
    uint8_t* bgrData[4] = {0}; int bgrStride[4] = {0}; int bgrW, bgrH; //unused planes stay NULL/0
    readPPM("sample.ppm", &bgrData[0], &bgrStride[0], &bgrW, &bgrH);

    //create NV12 image from the BGR image
    uint8_t* nv12Data[4]; int nv12Stride[4];
    av_image_alloc(nv12Data, nv12Stride, bgrW, bgrH, AV_PIX_FMT_NV12, 16);
    bgrToNV12(bgrData, bgrStride, nv12Data, nv12Stride, bgrW, bgrH);

    //convert nv12 patch to bgr patch
    nv12ToBgr(nv12Data, nv12Stride, bgrData, bgrStride, 220, 220);   //invalid result (random column stripe)
    //nv12ToBgr(nv12Data, nv12Stride, bgrData, bgrStride, 221, 220); //valid result

    //save bgr image (should be exactly as original BGR image)
    writePPM("sample-out.ppm", bgrData[0], bgrStride[0], bgrW, bgrH);

    //cleanup
    av_freep(bgrData);
    av_freep(nv12Data);
    return 0;
}

sws_scale performs color conversion and scaling at the same time.

Most of the algorithms used need to include neighboring pixels when calculating a target pixel. This can lead to problems at the edges if the image dimensions are not a multiple of x, where x depends on the algorithms used.

If you set the patch dimensions to a multiple of 8 (the next multiple of 8 after 220 is 224), it works without artifacts.

nv12ToBgr(nv12Data, nv12Stride, bgrData, bgrStride, 224, 224);
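The rounding step can be sketched as follows. This is only an illustration: align_up is a hypothetical helper name, and the alignment of 8 comes from the observation above, not from a documented swscale requirement. libavutil's FFALIGN macro performs the same rounding for power-of-two alignments.

```c
#include <assert.h>

/* Hypothetical helper: round value up to the next multiple of alignment.
 * The alignment of 8 used by callers below is an empirical choice. */
static int align_up(int value, int alignment)
{
    assert(value >= 0 && alignment > 0);
    return (value + alignment - 1) / alignment * alignment;
}
```

For example, align_up(220, 8) yields 224. The enlarged patch must still fit inside the source image, otherwise the conversion reads and writes out of bounds.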

Demo

Using image dimensions of 220 x 220 gives an artifact on the right edge of the converted patch (left image in the screenshot).

Choosing 224 x 224 gives no artifact (right image in the screenshot comparing both procedures).

[Screenshot: comparison]

Theoretically Required Minimum Alignment

Let's take a look at the YUV420 format:

The luma value is stored for every pixel. The color information, which is divided into Cb and Cr, is calculated per 2x2 pixel block. The minimum image size is therefore a 2 x 2 block, which takes 6 bytes (i.e. 12 bits per pixel: 12 * 4 = 48 bits = 6 bytes), see graphic here:

[Image: YUV420]

The minimum technical requirement is therefore an even width and height of the image.
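The byte count can be checked with a small calculation. This is a minimal sketch, assuming a tightly packed NV12 buffer without row padding (av_image_alloc with an alignment of 16 may pad each row, so its strides can be larger). The name nv12_buffer_size is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: size in bytes of a tightly packed NV12 image.
 * Assumes no row padding; real buffers may use larger strides. */
static size_t nv12_buffer_size(int width, int height)
{
    assert(width > 0 && height > 0);
    assert(width % 2 == 0 && height % 2 == 0); /* 2x2 chroma blocks */

    size_t luma = (size_t)width * (size_t)height; /* one Y byte per pixel */
    size_t chroma = luma / 2;                     /* one U and one V byte per 2x2 block */
    return luma + chroma;
}
```

A 2 x 2 image needs 4 luma bytes plus 2 chroma bytes, i.e. 6 bytes, and a 220 x 220 patch needs 72600 bytes.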

You have set the SWS_POINT flag for scaling, i.e. the nearest-neighbor method is used. So theoretically, for each output pixel the nearest input pixel is determined and used, which does not impose any alignment restrictions.
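To illustrate why a per-pixel lookup needs no alignment, here is a rough sketch of fetching the samples for one pixel of an NV12 image. It is not how libswscale is implemented. Nv12Sample and nv12_sample_at are hypothetical names, and the layout assumed is a full-resolution Y plane followed by a half-resolution plane of interleaved U,V byte pairs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type: the three samples that describe one pixel. */
typedef struct { uint8_t y, u, v; } Nv12Sample;

/* Hypothetical helper: look up Y, U and V for pixel (x, y).
 * Each 2x2 pixel block shares one interleaved U,V pair. */
static Nv12Sample nv12_sample_at(const uint8_t *y_plane, int y_stride,
                                 const uint8_t *uv_plane, int uv_stride,
                                 int x, int y)
{
    assert(x >= 0 && y >= 0);

    Nv12Sample s;
    s.y = y_plane[y * y_stride + x];

    /* Chroma row is y/2; each chroma column takes two bytes (U then V). */
    const uint8_t *uv = uv_plane + (y / 2) * uv_stride + (x / 2) * 2;
    s.u = uv[0];
    s.v = uv[1];
    return s;
}
```

Nothing in this lookup depends on the width being a multiple of 8, which suggests the restriction comes from how the optimized code paths process pixels in groups.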

Performance

An important aspect of actual implementations, however, is often performance. In this case, e.g., several adjacent pixels may be processed at once. Also, do not forget the possibility of hardware-accelerated operations.

Alternative solution

If for some reason you need to stick to a 220x220 format, you can alternatively use the SWS_BITEXACT flag.

It does:

Enable bitexact output.

see https://ffmpeg.org/ffmpeg-scaler.html#scaler_005foptions

So in nv12ToBgr you would use something like:

struct SwsContext* context = sws_getContext(w, h, AV_PIX_FMT_NV12,
                                            w, h, AV_PIX_FMT_BGR24, SWS_POINT | SWS_BITEXACT, NULL, NULL, NULL);

This doesn't give any artifacts either. If you have to convert a lot of frames, I would take a look at the performance.

