
Transcode H.264 video with MediaCodec directly, without texture rendering

I'm writing a transcoder using MediaCodec.

I created two MediaCodec instances, one for decoding and one for encoding, and I'm trying to feed the decoder's output buffers directly into the encoder's input buffers.

It compiles and runs without apparent problems, and it runs quickly. But something is wrong with the output video file. I checked the metadata of the output video and it is all correct: bitrate, framerate, resolution... Only the images in the video are wrong, like this: screen shot

I figured something must be wrong, but I cannot work out what...

I searched libraries and documentation, and I found some sample code that uses a texture surface to render the decoder's output and transfer the data into the encoder. But I thought that shouldn't be necessary for me, because I don't need to edit the images of the video. All I need to do is change the bitrate and resolution to make the file smaller.

Here is the core code in my project:

private void decodeCore() {
    MediaCodec.BufferInfo bufferInfo = new MediaCodec.BufferInfo();
    int frameCount = 0;
    while (mDecodeRunning) {

        int inputBufferId = mDecoder.dequeueInputBuffer(50);
        if (inputBufferId >= 0) {
            // fill inputBuffers[inputBufferId] with valid data
            int sampleSize = mExtractor.readSampleData(mDecodeInputBuffers[inputBufferId], 0);
            if (sampleSize >= 0) {
                long time = mExtractor.getSampleTime();
                mDecoder.queueInputBuffer(inputBufferId, 0, sampleSize, time, 0);
            } else {
                mDecoder.queueInputBuffer(inputBufferId, 0, 0, 0, MediaCodec.BUFFER_FLAG_END_OF_STREAM);
            }

            mExtractor.advance();
        }

        int outputBufferId = mDecoder.dequeueOutputBuffer(bufferInfo, 50);
        if (outputBufferId >= 0) {

            FrameData data = mFrameDataQueue.obtain();
            // wait until a free FrameData slot is available (i.e. the queue has room)
            while (data == null) {
                try {
                    Thread.sleep(20);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
                data = mFrameDataQueue.obtain();
            }

            data.data.clear();
            data.size = 0;
            data.offset = 0;
            data.flag = 0;
            data.frameTimeInUs = bufferInfo.presentationTimeUs;

            // outputBuffers[outputBufferId] is ready to be processed or rendered.
            if (bufferInfo.size > 0) {
                ByteBuffer buffer = mDecodeOutputBuffers[outputBufferId];

                buffer.position(bufferInfo.offset);
                buffer.limit(bufferInfo.offset + bufferInfo.size);

                data.data.put(buffer);
                data.data.flip();

                data.size = bufferInfo.size;
                data.frameIndex = frameCount++;

            }

            data.flag = bufferInfo.flags;

            if ((bufferInfo.flags & MediaCodec.BUFFER_FLAG_END_OF_STREAM) == MediaCodec.BUFFER_FLAG_END_OF_STREAM) {
                Log.d("bingbing_transcode", "decode over! frames:" + (frameCount - 1));
                mDecodeRunning = false;
            }

            mFrameDataQueue.pushToQueue(data);
            mDecoder.releaseOutputBuffer(outputBufferId, false);
            Log.d("bingbing_transcode", "decode output:\n frame:" + (frameCount - 1) + "\n" + "size:" + bufferInfo.size);
        } else if (outputBufferId == MediaCodec.INFO_OUTPUT_BUFFERS_CHANGED) {
            mDecodeOutputBuffers = mDecoder.getOutputBuffers();
        } else if (outputBufferId == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
            // Subsequent data will conform to new format.
            mDecodeOutputVideoFormat = mDecoder.getOutputFormat();
            configureAndStartEncoder();
        }
    }

    mDecoder.stop();
    mDecoder.release();
}

private void encodeCore() {
    int trackIndex = 0;
    boolean muxerStarted = false;
    MediaCodec.BufferInfo bufferInfo = new MediaCodec.BufferInfo();
    int frameCount = 0;
    while (mEncodeRunning) {
        int inputBufferId = mEncoder.dequeueInputBuffer(50);
        if (inputBufferId >= 0) {
            FrameData data = mFrameDataQueue.pollFromQueue();
            // wait until the queue has a frame to poll
            while (data == null) {
                try {
                    Thread.sleep(20);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
                data = mFrameDataQueue.pollFromQueue();
            }

            if (data.size > 0) {
                ByteBuffer inputBuffer = mEncodeInputBuffers[inputBufferId];
                inputBuffer.clear();
                inputBuffer.put(data.data);
                inputBuffer.flip();
            }
            mEncoder.queueInputBuffer(inputBufferId, 0, data.size, data.frameTimeInUs, data.flag);
            mFrameDataQueue.recycle(data);
        }

        int outputBufferId = mEncoder.dequeueOutputBuffer(bufferInfo, 50);
        if (outputBufferId >= 0) {
            // outputBuffers[outputBufferId] is ready to be processed or rendered.
            ByteBuffer encodedData = mEncodeOutputBuffers[outputBufferId];

            if (bufferInfo.size > 0) {
                if (encodedData == null) {
                    throw new RuntimeException("encoderOutputBuffer " + outputBufferId + " was null");
                }

                if (!muxerStarted) {
                    throw new RuntimeException("muxer hasn't started");
                }

                frameCount++;
            }
            // adjust the ByteBuffer values to match BufferInfo
            encodedData.position(bufferInfo.offset);
            encodedData.limit(bufferInfo.offset + bufferInfo.size);

            if (bufferInfo.size > 0
                    && (bufferInfo.flags & MediaCodec.BUFFER_FLAG_CODEC_CONFIG) == 0) {
                // skip empty buffers and the codec config data; the muxer gets
                // the csd buffers from the INFO_OUTPUT_FORMAT_CHANGED format
                mMuxer.writeSampleData(trackIndex, encodedData, bufferInfo);
            }

            if ((bufferInfo.flags & MediaCodec.BUFFER_FLAG_END_OF_STREAM) == MediaCodec.BUFFER_FLAG_END_OF_STREAM) {
                Log.d("bingbing_transcode", "encode over! frames:" + (frameCount - 1));
                mEncodeRunning = false;
            }

            mEncoder.releaseOutputBuffer(outputBufferId, false);
            Log.d("bingbing_transcode", "encode output:\n frame:" + (frameCount - 1) + "\n" + "size:" + bufferInfo.size);

        } else if (outputBufferId == MediaCodec.INFO_OUTPUT_BUFFERS_CHANGED) {
            mEncodeOutputBuffers = mEncoder.getOutputBuffers();
        } else if (outputBufferId == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
            // should happen before receiving buffers, and should only happen once
            if (muxerStarted) {
                throw new RuntimeException("format changed twice");
            }

            MediaFormat newFormat = mEncoder.getOutputFormat();
            Log.d("bingbing_transcode", "encoder output format changed: " + newFormat);

            // now that we have the Magic Goodies, start the muxer
            trackIndex = mMuxer.addTrack(newFormat);
            mMuxer.start();
            muxerStarted = true;
            mEncodeOutputVideoFormat = newFormat;
        }
    }


    mEncoder.stop();
    mEncoder.release();
    if (muxerStarted) {
        mMuxer.stop();
        mMuxer.release();
    }
}

These two functions run in two different threads.

FrameData is a simple holder for a frame's ByteBuffer, its presentation time, and the few other fields the pipeline needs.
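
For reference, a minimal sketch of what such a holder might look like (a hypothetical reconstruction; only the fields the code above touches are shown, and the buffer capacity is an assumption):

import java.nio.ByteBuffer;

// Hypothetical reconstruction of the FrameData holder used above.
class FrameData {
    ByteBuffer data;     // copy of one decoded frame
    int size;            // number of valid bytes in 'data'
    int offset;          // start offset within 'data'
    int flag;            // MediaCodec.BufferInfo flags
    long frameTimeInUs;  // presentation timestamp in microseconds
    int frameIndex;      // running frame counter

    FrameData(int capacity) {
        // e.g. width * height * 3 / 2 bytes for a YUV420 frame
        data = ByteBuffer.allocateDirect(capacity);
    }
}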

When using ByteBuffer input, a few details of the input data layout are undefined. When the width isn't a multiple of 16, some encoders want the input rows padded to a multiple of 16, while others assume a row length equal to the width, with no extra padding.
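
As a concrete illustration (a sketch; which of the two layouts a given encoder expects is device-specific):

// Some encoders expect each input row padded up to a multiple of 16 bytes.
int width = 642;                        // not a multiple of 16
int paddedStride = (width + 15) & ~15;  // rounds up to 656
// Other encoders assume stride == width (642). If the producer and the
// consumer disagree on the stride, every row is shifted a little further,
// which shows up as sheared/garbled images like in the screenshot.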

The Android CTS tests (which define what behaviour one can expect across all devices) for encoding from ByteBuffer inputs intentionally only test resolutions that are a multiple of 16, since they know different hardware vendors do this differently, and they didn't want to enforce any particular handling.

You can't generally assume that the decoder output uses the same row size as what the encoder consumes, either. The decoder is free to (and some actually do) return a significantly larger width than the actual content size, and use the crop_left/crop_right fields to indicate which part of it is actually intended to be visible. So in case the decoder did that, you can't pass the data straight from the decoder to the encoder; you have to copy it line by line, taking into account the actual line sizes used by the decoder and the encoder, as in the sketch below.
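
A minimal sketch of such a line-by-line copy for the luma plane, assuming a planar or semiplanar layout (the string keys follow the MediaFormat documentation, but not every decoder reports all of them, hence the fallbacks; decodedBuffer and encoderInput stand for the decoder output and encoder input ByteBuffers):

MediaFormat fmt = mDecoder.getOutputFormat();
int width  = fmt.getInteger(MediaFormat.KEY_WIDTH);
int height = fmt.getInteger(MediaFormat.KEY_HEIGHT);
int stride = fmt.containsKey("stride") ? fmt.getInteger("stride") : width;
int cropLeft = fmt.containsKey("crop-left") ? fmt.getInteger("crop-left") : 0;
int cropTop  = fmt.containsKey("crop-top")  ? fmt.getInteger("crop-top")  : 0;
// the crop window is inclusive, hence the +1
int visWidth  = fmt.containsKey("crop-right")
        ? fmt.getInteger("crop-right") - cropLeft + 1 : width;
int visHeight = fmt.containsKey("crop-bottom")
        ? fmt.getInteger("crop-bottom") - cropTop + 1 : height;

int encStride = visWidth; // or padded to a multiple of 16, depending on the encoder
byte[] row = new byte[visWidth];
for (int y = 0; y < visHeight; y++) {
    decodedBuffer.position((cropTop + y) * stride + cropLeft);
    decodedBuffer.get(row, 0, visWidth);
    encoderInput.position(y * encStride);
    encoderInput.put(row, 0, visWidth);
}
// The chroma plane(s) need the same treatment, starting after
// sliceHeight * stride bytes and with the subsampled dimensions.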

Additionally, you can't even assume that the decoder uses the same pixel format as the encoder. Many Qualcomm devices use a special tiled pixel format for the decoder output, while the encoder input is normal planar data. In these cases, you'd have to implement pretty complex logic for unshuffling the data before you can feed it into the encoder.
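
You can at least detect such a mismatch by comparing the color formats the two codecs report (a sketch; getInputFormat() requires API 21):

int decColor = mDecoder.getOutputFormat().getInteger(MediaFormat.KEY_COLOR_FORMAT);
int encColor = mEncoder.getInputFormat().getInteger(MediaFormat.KEY_COLOR_FORMAT);
if (decColor != encColor) {
    // E.g. a Qualcomm decoder may report a vendor format such as
    // MediaCodecInfo.CodecCapabilities.COLOR_QCOM_FormatYUV420SemiPlanar,
    // while the encoder expects COLOR_FormatYUV420SemiPlanar or
    // COLOR_FormatYUV420Planar; a raw copy then produces scrambled images.
    Log.w("bingbing_transcode", "color format mismatch: " + decColor + " vs " + encColor);
}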

Using a texture surface as an intermediate hides all of these details. It might not sound completely necessary for your use case, but it does hide all the variation in buffer formats between decoder and encoder.
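
For comparison, here is the surface wiring in its simplest form (a sketch: decoding straight onto the encoder's input surface works when no scaling is needed; since you also want to change the resolution, you'd normally put a SurfaceTexture plus an OpenGL ES blit between the two codecs, as the DecodeEditEncodeTest CTS sample does; encodeFormat and decodeFormat are assumed to be set up elsewhere):

MediaCodec encoder = MediaCodec.createEncoderByType("video/avc");
encoder.configure(encodeFormat, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
// must be called after configure() and before start()
Surface inputSurface = encoder.createInputSurface();
encoder.start();

MediaCodec decoder = MediaCodec.createDecoderByType("video/avc");
decoder.configure(decodeFormat, inputSurface, null, 0); // render onto the encoder's surface
decoder.start();

// In the decode loop, render each frame instead of copying bytes:
//     decoder.releaseOutputBuffer(outputBufferId, true /* render */);
// and when the decoder reaches end of stream:
//     encoder.signalEndOfInputStream();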
