
sws_scale generates malformed video

I have to encode a series of frames from CAIRO_FORMAT_ARGB32 to AV_PIX_FMT_YUV420P with sws_scale. From the FFmpeg docs I learned that the AV equivalent of the source format is AV_PIX_FMT_ARGB, so here is my code:

    // Set up conversion context
    img->sws_ctx = sws_getCachedContext(
        img->sws_ctx,
        img->video_size[0],
        img->video_size[1],
        AV_PIX_FMT_ARGB,
        img->video_size[0],
        img->video_size[1],
        AV_PIX_FMT_YUV420P,
        SWS_BILINEAR,
        NULL,
        NULL,
        NULL);

    width  = cairo_image_surface_get_width( surface );
    height = cairo_image_surface_get_height( surface );
    stride = cairo_image_surface_get_stride( surface );
    pix    = cairo_image_surface_get_data( surface );
    const int in_linesize[1] = { stride };
    
    sws_scale(  img->sws_ctx, (const uint8_t * const *) &pix, in_linesize, 0,
                img->video_size[1], img->video_frame->data, img->video_frame->linesize);
    img->video_frame->pts++;

Sadly the video doesn't play and VLC shows a bunch of these useless messages:

[h264 @ 0x7f6ce0cbc1c0] mmco: unref short failure
[h264 @ 0x7f6ce0c39a80] co located POCs unavailable
[h264 @ 0x7f6ce0c82800] co located POCs unavailable
[h264 @ 0x7f6ce0c9f400] mmco: unref short failure

The encoding process runs just fine. I also tried with const int in_linesize[1] = { 3 * width };. Where am I wrong?

The following answer shows how to use sws_scale to convert ARGB to YUV420p.
You will have to make some adaptations to integrate the conversion into your code.

The code sample is a "stand-alone" sws_scale example that doesn't use Cairo.


Create an ARGB input sample using FFmpeg (command-line tool):

ffmpeg -y -f lavfi -i testsrc=size=192x108:rate=1 -vcodec rawvideo -pix_fmt argb -frames 1 -f rawvideo argb_image.bin

The following code sample applies the following stages:

  • Read the ARGB (sample) input from a binary file.
  • Allocate memory buffers for storing the YUV420p output.
  • Get the SWS context.
  • Apply the color conversion.
  • Write the YUV420p output image to a binary file (for testing).
  • Free the allocated memory.

C++ code sample:

#include <stdio.h>
#include <string.h>
#include <stdint.h>

extern "C"
{
#include <libswscale/swscale.h>
#include <libavutil/imgutils.h>
}


int main()
{
    //Use FFmpeg for building raw ARGB image (used as input).
    //ffmpeg -y -f lavfi -i testsrc=size=192x108:rate=1 -vcodec rawvideo -pix_fmt argb -frames 1 -f rawvideo argb_image.bin
    
    const int width = 192;
    const int height = 108;
    uint8_t* argb_in = new uint8_t[width * height * 4];   //Allocate 4 bytes per pixel (ARGB)

    const enum AVPixelFormat out_pix_fmt = AV_PIX_FMT_YUV420P;

    //Read input image from binary file (for testing)
    ////////////////////////////////////////////////////////////////////////////
    FILE* f = fopen("argb_image.bin", "rb"); //For using fopen in Visual Studio, define: _CRT_SECURE_NO_WARNINGS (or use fopen_s).

    if (f == nullptr)
    {
        printf("Error: failed to open argb_image.bin for reading\n");
        return -1;
    }

    fread(argb_in, 1, width * height * 4, f);
    fclose(f);
    ////////////////////////////////////////////////////////////////////////////    

    //Allocate output buffers:
    ////////////////////////////////////////////////////////////////////////////
    // YUV420p data is separated in three planes
    // 1. Y - intensity plane, resolution: width x height
    // 2. U - Color plane, resolution: width/2 x height/2
    // 3. V - Color plane, resolution: width/2 x height/2

    int out_linesize[4] = {0, 0, 0, 0};
    uint8_t* out_planes[4] = { nullptr, nullptr, nullptr, nullptr };   

    int sts = av_image_alloc(out_planes,    //uint8_t * pointers[4], 
                             out_linesize,  //int linesizes[4], 
                             width,         //int w, 
                             height,        //int h, 
                             out_pix_fmt,   //enum AVPixelFormat pix_fmt, 
                             32);           //int align);   //Aligning to a 32-byte address may result in faster execution compared to 1-byte alignment.

    if (sts < 0)
    {
        printf("Error: av_image_alloc response = %d\n", sts);
        return -1;
    }
    ////////////////////////////////////////////////////////////////////////////

    
    
    //Get SWS context
    ////////////////////////////////////////////////////////////////////////////
    struct SwsContext* sws_context = nullptr;

    sws_context = sws_getCachedContext(sws_context,         //struct SwsContext *context,
                                       width,               //int srcW,
                                       height,              //int srcH,
                                       AV_PIX_FMT_ARGB,     //enum AVPixelFormat srcFormat,
                                       width,               //int dstW,
                                       height,              //int dstH,
                                       out_pix_fmt,         //enum AVPixelFormat dstFormat,
                                       SWS_BILINEAR,        //int flags,
                                       nullptr,             //SwsFilter *srcFilter,
                                       nullptr,             //SwsFilter *dstFilter,
                                       nullptr);            //const double *param);

    if (sws_context == nullptr)
    {
        printf("Error: sws_getCachedContext returned nullptr\n");
        return -1;
    }
    ////////////////////////////////////////////////////////////////////////////


    //Apply color conversion
    ////////////////////////////////////////////////////////////////////////////
    const int in_linesize[1] = { 4 * width }; // ARGB stride (4 bytes per pixel - assume data is continuous).
    const uint8_t* in_planes[1] = { argb_in };

    int response = sws_scale(sws_context,   //struct SwsContext *c, 
                             in_planes,     //const uint8_t *const srcSlice[],
                             in_linesize,   //const int srcStride[], 
                             0,             //int srcSliceY, 
                             height,        //int srcSliceH,
                             out_planes,    //uint8_t *const dst[], 
                             out_linesize); //const int dstStride[]);


    if (response < 0)
    {
        printf("Error: sws_scale response = %d\n", response);
        return -1;
    }
    ////////////////////////////////////////////////////////////////////////////


    //Write YUV420p output image to binary file (for testing)
    //You may execute FFmpeg after conversion for testing the output:
    //ffmpeg -y -f rawvideo -s 192x108 -pixel_format yuv420p -i yuv420p_image.bin rgb.png
    ////////////////////////////////////////////////////////////////////////////
    f = fopen("yuv420p_image.bin", "wb");
    fwrite(out_planes[0], 1, width * height, f);     //Y plane: width x height bytes
    fwrite(out_planes[1], 1, width * height / 4, f); //U plane: (width/2) x (height/2) bytes
    fwrite(out_planes[2], 1, width * height / 4, f); //V plane: (width/2) x (height/2) bytes
    //Note: writing plane-width bytes per row is valid here because out_linesize matches the plane widths (192 is a multiple of the 32-byte alignment).
    fclose(f);
    ////////////////////////////////////////////////////////////////////////////


    //Free allocated memory
    ////////////////////////////////////////////////////////////////////////////
    av_freep(out_planes);
    sws_freeContext(sws_context);
    delete[] argb_in;
    ////////////////////////////////////////////////////////////////////////////

    return 0;
}
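For reference, assuming the FFmpeg development packages are installed and visible to pkg-config (an assumption about your environment; adjust the file name and module names as needed), the sample can be built with something like:

```shell
# Hypothetical file name; libswscale and libavutil are the pkg-config module names of the two FFmpeg libraries used above.
g++ -o sws_sample sws_sample.cpp $(pkg-config --cflags --libs libswscale libavutil)
```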

For testing the output, convert yuv420p_image.bin to PNG image using FFmpeg:

ffmpeg -y -f rawvideo -s 192x108 -pixel_format yuv420p -i yuv420p_image.bin rgb.png

rgb.png (result of FFmpeg conversion):
