
How does one encode a series of images into H264 using the x264 C API?

How does one use the x264 C API to encode RGB images into H264 frames? I already created a sequence of RGB images; how can I now transform that sequence into a sequence of H264 frames? In particular, how do I encode this sequence of RGB images into a sequence of H264 frames consisting of a single initial H264 keyframe followed by dependent H264 frames?

First of all: check the x264.h file; it contains more or less the reference for each function and structure. The x264.c file you can find in the download contains a sample implementation. Most people say to base yourself on that one, but I find it rather complex for beginners; it is good as an example to fall back on, however.

First you set up some parameters of the type x264_param_t; a good site describing the parameters is http://mewiki.project357.com/wiki/X264_Settings. Also take a look at the x264_param_default_preset function, which allows you to target some functionality without needing to understand all of the (sometimes quite complex) parameters. Also use x264_param_apply_profile afterwards (you'll probably want the "baseline" profile).

This is some example setup from my code:

x264_param_t param;
x264_param_default_preset(&param, "veryfast", "zerolatency");
param.i_threads = 1;
param.i_width = width;
param.i_height = height;
param.i_fps_num = fps;
param.i_fps_den = 1;
// Intra refresh:
param.i_keyint_max = fps;
param.b_intra_refresh = 1;
//Rate control:
param.rc.i_rc_method = X264_RC_CRF;
param.rc.f_rf_constant = 25;
param.rc.f_rf_constant_max = 35;
//For streaming:
param.b_repeat_headers = 1;
param.b_annexb = 1;
x264_param_apply_profile(&param, "baseline");

After this you can initialize the encoder as follows:

x264_t* encoder = x264_encoder_open(&param);
x264_picture_t pic_in, pic_out;
x264_picture_alloc(&pic_in, X264_CSP_I420, width, height);

X264 expects YUV420P data (I guess it accepts some others too, but that's the common one). You can use libswscale (from ffmpeg) to convert images to the right format. Initializing it looks like this (I assume RGB data with 24bpp):

struct SwsContext* convertCtx = sws_getContext(in_w, in_h, PIX_FMT_RGB24, out_w, out_h, PIX_FMT_YUV420P, SWS_FAST_BILINEAR, NULL, NULL, NULL);

Encoding is as simple as this then; for each frame, do:

//data is a pointer to your RGB structure
int srcstride = width * 3; //RGB stride is just 3 * width
sws_scale(convertCtx, &data, &srcstride, 0, height, pic_in.img.plane, pic_in.img.stride);
x264_nal_t* nals;
int i_nals;
int frame_size = x264_encoder_encode(encoder, &nals, &i_nals, &pic_in, &pic_out);
if (frame_size >= 0)
{
    // OK
}
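
If you also need to write the encoded stream out, note that with b_annexb = 1 the NAL units produced by a single x264_encoder_encode call are guaranteed to be laid out sequentially in memory. A minimal sketch, assuming an already opened FILE* out (which is not part of the original snippet):

if (frame_size > 0)
{
    // All NALs of this frame are contiguous in Annex B mode,
    // starting at the payload of the first NAL.
    fwrite(nals[0].p_payload, 1, frame_size, out);
}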

I hope this will get you going ;). I spent a long time on it myself to get started. X264 is an insanely strong but sometimes complex piece of software.

edit: When you use other parameters there will be delayed frames; this is not the case with my parameters (mostly due to the zerolatency option). If this is the case, frame_size will sometimes be zero and you'll have to keep calling x264_encoder_encode as long as x264_encoder_delayed_frames does not return 0. But for this functionality you should take a deeper peek into x264.c and x264.h.
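
A minimal flush loop for that case, under the same assumptions as the sketch above, might look like this:

// Flush frames still buffered inside the encoder by passing NULL as pic_in.
while (x264_encoder_delayed_frames(encoder) > 0)
{
    frame_size = x264_encoder_encode(encoder, &nals, &i_nals, NULL, &pic_out);
    if (frame_size < 0)
        break; // encoding error
    if (frame_size > 0)
        fwrite(nals[0].p_payload, 1, frame_size, out);
}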

I've uploaded an example which generates raw YUV frames and then encodes them using x264. Full code can be found here: https://gist.github.com/roxlu/6453908

FFmpeg 2.8.6 C runnable example

Using FFmpeg as a wrapper for x264 is a good idea, as it exposes a uniform API for multiple encoders. So if you ever need to change formats, you can change just one parameter instead of learning a new API.

The example synthesizes and encodes some colorful frames generated by generate_rgb.

Control of frame type (I, P, B) to have as few key-frames as possible (ideally just the first) is discussed here: https://stackoverflow.com/a/36412909/895245 As mentioned there, I do not recommend it for most applications.

The key lines that do frame type control here are:

/* Minimal distance of I-frames. This is the maximum value allowed,
or else we get a warning at runtime. */
c->keyint_min = 600;

and:

if (frame->pts == 1) {
    frame->key_frame = 1;
    frame->pict_type = AV_PICTURE_TYPE_I;
} else {
    frame->key_frame = 0;
    frame->pict_type = AV_PICTURE_TYPE_P;
}

We can then verify the frame type with:

ffprobe -select_streams v \
    -show_frames \
    -show_entries frame=pict_type \
    -of csv \
    tmp.h264

as mentioned at: https://superuser.com/questions/885452/extracting-the-index-of-key-frames-from-a-video-using-ffmpeg

Preview of the generated output.

main.c

#include <stdio.h>
#include <stdlib.h>

#include <libavcodec/avcodec.h>
#include <libavutil/imgutils.h>
#include <libavutil/opt.h>
#include <libswscale/swscale.h>

static AVCodecContext *c = NULL;
static AVFrame *frame;
static AVPacket pkt;
static FILE *file;
struct SwsContext *sws_context = NULL;

static void ffmpeg_encoder_set_frame_yuv_from_rgb(uint8_t *rgb) {
    const int in_linesize[1] = { 3 * c->width };
    sws_context = sws_getCachedContext(sws_context,
            c->width, c->height, AV_PIX_FMT_RGB24,
            c->width, c->height, AV_PIX_FMT_YUV420P,
            0, 0, 0, 0);
    sws_scale(sws_context, (const uint8_t * const *)&rgb, in_linesize, 0,
            c->height, frame->data, frame->linesize);
}

uint8_t* generate_rgb(int width, int height, int pts, uint8_t *rgb) {
    int x, y, cur;
    rgb = realloc(rgb, 3 * sizeof(uint8_t) * height * width);
    for (y = 0; y < height; y++) {
        for (x = 0; x < width; x++) {
            cur = 3 * (y * width + x);
            rgb[cur + 0] = 0;
            rgb[cur + 1] = 0;
            rgb[cur + 2] = 0;
            if ((pts / 25) % 2 == 0) {
                if (y < height / 2) {
                    if (x < width / 2) {
                        /* Black. */
                    } else {
                        rgb[cur + 0] = 255;
                    }
                } else {
                    if (x < width / 2) {
                        rgb[cur + 1] = 255;
                    } else {
                        rgb[cur + 2] = 255;
                    }
                }
            } else {
                if (y < height / 2) {
                    rgb[cur + 0] = 255;
                    if (x < width / 2) {
                        rgb[cur + 1] = 255;
                    } else {
                        rgb[cur + 2] = 255;
                    }
                } else {
                    if (x < width / 2) {
                        rgb[cur + 1] = 255;
                        rgb[cur + 2] = 255;
                    } else {
                        rgb[cur + 0] = 255;
                        rgb[cur + 1] = 255;
                        rgb[cur + 2] = 255;
                    }
                }
            }
        }
    }
    return rgb;
}

/* Allocate resources and write header data to the output file. */
void ffmpeg_encoder_start(const char *filename, int codec_id, int fps, int width, int height) {
    AVCodec *codec;
    int ret;

    codec = avcodec_find_encoder(codec_id);
    if (!codec) {
        fprintf(stderr, "Codec not found\n");
        exit(1);
    }
    c = avcodec_alloc_context3(codec);
    if (!c) {
        fprintf(stderr, "Could not allocate video codec context\n");
        exit(1);
    }
    c->bit_rate = 400000;
    c->width = width;
    c->height = height;
    c->time_base.num = 1;
    c->time_base.den = fps;
    c->keyint_min = 600;
    c->pix_fmt = AV_PIX_FMT_YUV420P;
    if (codec_id == AV_CODEC_ID_H264)
        av_opt_set(c->priv_data, "preset", "slow", 0);
    if (avcodec_open2(c, codec, NULL) < 0) {
        fprintf(stderr, "Could not open codec\n");
        exit(1);
    }
    file = fopen(filename, "wb");
    if (!file) {
        fprintf(stderr, "Could not open %s\n", filename);
        exit(1);
    }
    frame = av_frame_alloc();
    if (!frame) {
        fprintf(stderr, "Could not allocate video frame\n");
        exit(1);
    }
    frame->format = c->pix_fmt;
    frame->width  = c->width;
    frame->height = c->height;
    ret = av_image_alloc(frame->data, frame->linesize, c->width, c->height, c->pix_fmt, 32);
    if (ret < 0) {
        fprintf(stderr, "Could not allocate raw picture buffer\n");
        exit(1);
    }
}

/*
Write trailing data to the output file
and free resources allocated by ffmpeg_encoder_start.
*/
void ffmpeg_encoder_finish(void) {
    uint8_t endcode[] = { 0, 0, 1, 0xb7 };
    int got_output, ret;
    do {
        fflush(stdout);
        ret = avcodec_encode_video2(c, &pkt, NULL, &got_output);
        if (ret < 0) {
            fprintf(stderr, "Error encoding frame\n");
            exit(1);
        }
        if (got_output) {
            fwrite(pkt.data, 1, pkt.size, file);
            av_packet_unref(&pkt);
        }
    } while (got_output);
    fwrite(endcode, 1, sizeof(endcode), file);
    fclose(file);
    avcodec_close(c);
    av_free(c);
    av_freep(&frame->data[0]);
    av_frame_free(&frame);
}

/*
Encode one frame from an RGB24 input and save it to the output file.
Must be called after ffmpeg_encoder_start, and ffmpeg_encoder_finish
must be called after the last call to this function.
*/
void ffmpeg_encoder_encode_frame(uint8_t *rgb) {
    int ret, got_output;
    ffmpeg_encoder_set_frame_yuv_from_rgb(rgb);
    av_init_packet(&pkt);
    pkt.data = NULL;
    pkt.size = 0;
    if (frame->pts == 1) {
        frame->key_frame = 1;
        frame->pict_type = AV_PICTURE_TYPE_I;
    } else {
        frame->key_frame = 0;
        frame->pict_type = AV_PICTURE_TYPE_P;
    }
    ret = avcodec_encode_video2(c, &pkt, frame, &got_output);
    if (ret < 0) {
        fprintf(stderr, "Error encoding frame\n");
        exit(1);
    }
    if (got_output) {
        fwrite(pkt.data, 1, pkt.size, file);
        av_packet_unref(&pkt);
    }
}

/* Represents the main loop of an application which generates one frame per loop. */
static void encode_example(const char *filename, int codec_id) {
    int pts;
    int width = 320;
    int height = 240;
    uint8_t *rgb = NULL;
    ffmpeg_encoder_start(filename, codec_id, 25, width, height);
    for (pts = 0; pts < 100; pts++) {
        frame->pts = pts;
        rgb = generate_rgb(width, height, pts, rgb);
        ffmpeg_encoder_encode_frame(rgb);
    }
    ffmpeg_encoder_finish();
}

int main(void) {
    avcodec_register_all();
    encode_example("tmp.h264", AV_CODEC_ID_H264);
    encode_example("tmp.mpg", AV_CODEC_ID_MPEG1VIDEO);
    return 0;
}

Compile and run with:

gcc -o main.out -std=c99 -Wextra main.c -lavcodec -lswscale -lavutil
./main.out
ffplay tmp.mpg
ffplay tmp.h264

Tested on Ubuntu 16.04. GitHub upstream.
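
Note that avcodec_encode_video2 and av_init_packet were deprecated after FFmpeg 3.1 in favor of the send/receive API. As a rough sketch only (not part of the tested example above; the function name is hypothetical), ffmpeg_encoder_encode_frame could be adapted for newer releases like this:

/* Hypothetical send/receive variant for FFmpeg >= 3.1.
   Passing f == NULL flushes the encoder, which would also replace
   the avcodec_encode_video2 loop in ffmpeg_encoder_finish. */
static void ffmpeg_encoder_encode_frame_new_api(AVFrame *f) {
    AVPacket *p = av_packet_alloc();
    if (avcodec_send_frame(c, f) < 0)
        exit(1);
    for (;;) {
        int ret = avcodec_receive_packet(c, p);
        if (ret == AVERROR(EAGAIN) || ret == AVERROR_EOF)
            break; /* needs more input, or fully flushed */
        if (ret < 0)
            exit(1);
        fwrite(p->data, 1, p->size, file);
        av_packet_unref(p);
    }
    av_packet_free(&p);
}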
