
Android: Encoding audio and video using MediaCodec

I'm trying to encode video from the camera and audio from the microphone using MediaCodec and MediaMuxer. I use OpenGL to overlay text on the image while recording.

I took the bigflake MediaCodec test classes as examples (CameraToMpegTest in particular; the CodecInputSurface and SurfaceTextureManager classes come from there).

I wrote a main class that performs the encoding. It spawns two threads for recording audio and video. It does not work (the generated file is not valid), but if I comment out one of the threads (either audio or video), it works fine. Also, I then need to set TRACK_COUNT to 1. This is the code for the main class:

import android.graphics.SurfaceTexture;
import android.media.AudioFormat;
import android.media.AudioRecord;
import android.media.MediaCodec;
import android.media.MediaCodecInfo;
import android.media.MediaFormat;
import android.media.MediaMuxer;
import android.media.MediaRecorder;

import com.google.common.base.Throwables;

import java.io.IOException;
import java.nio.ByteBuffer;

import static com.google.common.base.Preconditions.checkNotNull;

/**
 * Class for recording a reply including a text message.
 */
public class ReplyRecorder {
    // Encoding state (read by the encoder threads, written from the caller's thread)
    private volatile boolean encoding;
    private long startWhen;

    // Muxer
    private static final int TRACK_COUNT = 2;
    private Muxer mMuxer;

    // Video
    private static final String VIDEO_MIME_TYPE = "video/avc"; // H.264 Advanced Video Coding
    private static final int FRAME_RATE = 15;                  // 15 fps
    private static final int IFRAME_INTERVAL = 10;             // 10 seconds between I-frames
    private static final int BIT_RATE = 2000000;               // 2 Mbps

    private Encoder mVideoEncoder;
    private CodecInputSurface mInputSurface;

    private SurfaceTextureManager mStManager;

    // Audio
    private static final String AUDIO_MIME_TYPE = "audio/mp4a-latm";
    private static final int SAMPLE_RATE = 44100;
    private static final int SAMPLES_PER_FRAME = 1024;
    private static final int CHANNEL_CONFIG = AudioFormat.CHANNEL_IN_MONO;
    private static final int AUDIO_FORMAT = AudioFormat.ENCODING_PCM_16BIT;

    private Encoder mAudioEncoder;
    private AudioRecord audioRecord;

    public void start(final CameraManager cameraManager, final String messageText, final String filePath) {
        checkNotNull(cameraManager);
        checkNotNull(messageText);
        checkNotNull(filePath);

        try {
            // Create a MediaMuxer.  We can't add the video track and start() the muxer here,
            // because our MediaFormat doesn't have the Magic Goodies.  These can only be
            // obtained from the encoder after it has started processing data.
            mMuxer = new Muxer(new MediaMuxer(filePath, MediaMuxer.OutputFormat.MUXER_OUTPUT_MPEG_4), TRACK_COUNT);
            startWhen = System.nanoTime();
            encoding = true;
            new Thread(new Runnable() {
                @Override
                public void run() {
                    initVideoComponents(cameraManager, messageText);
                    encodeVideo(cameraManager);
                }
            }).start();
            new Thread(new Runnable() {
                @Override
                public void run() {
                    initAudioComponents();
                    encodeAudio();
                }
            }).start();
        } catch (IOException e) {
            release();
            throw Throwables.propagate(e);
        }
    }

    private void initVideoComponents(CameraManager cameraManager,
                                     String messageText) {
        try {
            MediaFormat format = MediaFormat.createVideoFormat(VIDEO_MIME_TYPE, cameraManager.getEncWidth(), cameraManager.getEncHeight());

            // Set some properties.  Failing to specify some of these can cause the MediaCodec
            // configure() call to throw an unhelpful exception.
            format.setInteger(MediaFormat.KEY_COLOR_FORMAT,
                    MediaCodecInfo.CodecCapabilities.COLOR_FormatSurface);
            format.setInteger(MediaFormat.KEY_BIT_RATE, BIT_RATE);
            format.setInteger(MediaFormat.KEY_FRAME_RATE, FRAME_RATE);
            format.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, IFRAME_INTERVAL);

            // Create a MediaCodec encoder, and configure it with our format.  Get a Surface
            // we can use for input and wrap it with a class that handles the EGL work.
            //
            // If you want to have two EGL contexts -- one for display, one for recording --
            // you will likely want to defer instantiation of CodecInputSurface until after the
            // "display" EGL context is created, then modify the eglCreateContext call to
            // take eglGetCurrentContext() as the share_context argument.
            mVideoEncoder = new Encoder(VIDEO_MIME_TYPE, format, mMuxer);
            mInputSurface = new CodecInputSurface(mVideoEncoder.getEncoder().createInputSurface());
            mVideoEncoder.getEncoder().start();

            mInputSurface.makeCurrent();
            mStManager = new SurfaceTextureManager(messageText, cameraManager.getEncWidth(), cameraManager.getEncHeight());
        } catch (RuntimeException e) {
            releaseVideo();
            throw e;
        }
    }

    private void encodeVideo(CameraManager cameraManager) {
        try {

            SurfaceTexture st = mStManager.getSurfaceTexture();
            cameraManager.record(st);

            while (encoding) {
                // Feed any pending encoder output into the muxer.
                mVideoEncoder.drain(false);

                // Acquire a new frame of input, and render it to the Surface.  If we had a
                // GLSurfaceView we could switch EGL contexts and call drawImage() a second
                // time to render it on screen.  The texture can be shared between contexts by
                // passing the GLSurfaceView's EGLContext as eglCreateContext()'s share_context
                // argument.
                mStManager.awaitNewImage();
                mStManager.drawImage();

                // Set the presentation time stamp from the SurfaceTexture's time stamp.  This
                // will be used by MediaMuxer to set the PTS in the video.
                mInputSurface.setPresentationTime(st.getTimestamp() - startWhen);

                // Submit it to the encoder.  The eglSwapBuffers call will block if the input
                // is full, which would be bad if it stayed full until we dequeued an output
                // buffer (which we can't do, since we're stuck here).  So long as we fully drain
                // the encoder before supplying additional input, the system guarantees that we
                // can supply another frame without blocking.
                mInputSurface.swapBuffers();
            }

            // send end-of-stream to encoder, and drain remaining output
            mVideoEncoder.drain(true);
        } finally {
            releaseVideo();
        }
    }

    private void initAudioComponents() {
        try {
            int min_buffer_size = AudioRecord.getMinBufferSize(SAMPLE_RATE, CHANNEL_CONFIG, AUDIO_FORMAT);
            int buffer_size = SAMPLES_PER_FRAME * 10;
            if (buffer_size < min_buffer_size)
                buffer_size = ((min_buffer_size / SAMPLES_PER_FRAME) + 1) * SAMPLES_PER_FRAME * 2;

            audioRecord = new AudioRecord(
                    MediaRecorder.AudioSource.MIC,       // source
                    SAMPLE_RATE,                         // sample rate, hz
                    CHANNEL_CONFIG,                      // channels
                    AUDIO_FORMAT,                        // audio format
                    buffer_size);                        // buffer size (bytes)

            /////////////////

            MediaFormat format = new MediaFormat();
            format.setString(MediaFormat.KEY_MIME, AUDIO_MIME_TYPE);
            format.setInteger(MediaFormat.KEY_AAC_PROFILE, MediaCodecInfo.CodecProfileLevel.AACObjectLC);
            format.setInteger(MediaFormat.KEY_SAMPLE_RATE, SAMPLE_RATE);
            format.setInteger(MediaFormat.KEY_CHANNEL_COUNT, 1);
            format.setInteger(MediaFormat.KEY_BIT_RATE, 128000);
            format.setInteger(MediaFormat.KEY_MAX_INPUT_SIZE, 16384);

            mAudioEncoder = new Encoder(AUDIO_MIME_TYPE, format, mMuxer);
            mAudioEncoder.getEncoder().start();
        } catch (RuntimeException e) {
            releaseAudio();
            throw e;
        }
    }

    private void encodeAudio() {
        try {
            audioRecord.startRecording();
            while (encoding) {
                mAudioEncoder.drain(false);
                sendAudioToEncoder(false);
            }
            //TODO: Sending "false" because calling signalEndOfInputStream fails on this encoder
            mAudioEncoder.drain(false);
        } finally {
            releaseAudio();
        }
    }

    public void sendAudioToEncoder(boolean endOfStream) {
        // send current frame data to encoder
        ByteBuffer[] inputBuffers = mAudioEncoder.getEncoder().getInputBuffers();
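        // A timeout of -1 makes dequeueInputBuffer block until a buffer is free.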
        int inputBufferIndex = mAudioEncoder.getEncoder().dequeueInputBuffer(-1);
        if (inputBufferIndex >= 0) {
            ByteBuffer inputBuffer = inputBuffers[inputBufferIndex];
            inputBuffer.clear();
            long presentationTimeNs = System.nanoTime();
            int inputLength = audioRecord.read(inputBuffer, SAMPLES_PER_FRAME);
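            // Intended to back-date the timestamp by the buffer's duration, but the
            // integer division below truncates to zero (see the answers further down).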
            presentationTimeNs -= (inputLength / SAMPLE_RATE) / 1000000000;

            long presentationTimeUs = (presentationTimeNs - startWhen) / 1000;
            if (endOfStream) {
                mAudioEncoder.getEncoder().queueInputBuffer(inputBufferIndex, 0, inputLength, presentationTimeUs, MediaCodec.BUFFER_FLAG_END_OF_STREAM);
            } else {
                mAudioEncoder.getEncoder().queueInputBuffer(inputBufferIndex, 0, inputLength, presentationTimeUs, 0);
            }
        }
    }

    public void stop() {
        encoding = false;
    }

    /**
     * Releases encoder resources.
     */
    public void release() {
        releaseVideo();
        releaseAudio();
    }

    private void releaseVideo() {
        if (mVideoEncoder != null) {
            mVideoEncoder.release();
            mVideoEncoder = null;
        }
        if (mInputSurface != null) {
            mInputSurface.release();
            mInputSurface = null;
        }
        if (mStManager != null) {
            mStManager.release();
            mStManager = null;
        }
        releaseMuxer();
    }

    private void releaseAudio() {
        if (audioRecord != null) {
            audioRecord.stop();
            audioRecord.release();
            audioRecord = null;
        }
        if (mAudioEncoder != null) {
            mAudioEncoder.release();
            mAudioEncoder = null;
        }
        releaseMuxer();
    }

    private void releaseMuxer() {
        if (mMuxer != null && mVideoEncoder == null && mAudioEncoder == null) {
            mMuxer.release();
            mMuxer = null;
        }
    }

    public boolean isRecording() {
        return mMuxer != null;
    }
}

The class that wraps the muxer and waits for both tracks to be added before starting it is the following (I added some synchronization just for testing):

import android.media.MediaCodec;
import android.media.MediaFormat;
import android.media.MediaMuxer;

import com.google.common.base.Throwables;

import java.nio.ByteBuffer;

import static com.google.common.base.Preconditions.checkNotNull;
import static com.google.common.base.Preconditions.checkState;

/**
 * Class responsible for muxing. Wraps a MediaMuxer.
 */
public class Muxer {
    private final MediaMuxer muxer;
    private final int totalTracks;
    private int trackCounter;

    public Muxer(MediaMuxer muxer, int totalTracks) {
        this.muxer = checkNotNull(muxer);
        this.totalTracks = totalTracks;
    }

    public synchronized int addTrack(MediaFormat format) {
        checkState(!isStarted(), "Muxer already started");
        int trackIndex = muxer.addTrack(format);
        trackCounter++;
        if (isStarted()) {
            muxer.start();
            notifyAll();
        } else {
            while (!isStarted()) {
                try {
                    wait();
                } catch (InterruptedException e) {
                    throw Throwables.propagate(e);
                }
            }
        }
        return trackIndex;
    }

    public synchronized void writeSampleData(int trackIndex, ByteBuffer byteBuf,
                                             MediaCodec.BufferInfo bufferInfo) {
        checkState(isStarted(), "Muxer not started");
        muxer.writeSampleData(trackIndex, byteBuf, bufferInfo);
    }

    public void release() {
        if (muxer != null) {
            try {
                muxer.stop();
            } catch (Exception e) {
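                // Ignored so that release() below always runs; stop() throws if the muxer never started.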
            }
            muxer.release();
        }
    }

    private boolean isStarted() {
        return trackCounter == totalTracks;
    }
}

And the class responsible for writing to the MediaCodec encoder is the following:

import android.media.MediaCodec;
import android.media.MediaFormat;

import com.google.common.base.Throwables;

import java.io.IOException;
import java.nio.ByteBuffer;

import static com.google.common.base.Preconditions.checkNotNull;
import static com.google.common.base.Preconditions.checkState;

/**
 * Class responsible for encoding.
 */
public class Encoder {
    private final MediaCodec encoder;
    private final Muxer muxer;
    private final MediaCodec.BufferInfo bufferInfo;
    private int trackIndex;


    public Encoder(String mimeType, MediaFormat format, Muxer muxer) {
        checkNotNull(mimeType);
        checkNotNull(format);
        checkNotNull(muxer);

        try {
            encoder = MediaCodec.createEncoderByType(mimeType);
            encoder.configure(format, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);

            this.muxer = muxer;
            bufferInfo = new MediaCodec.BufferInfo();
        } catch (IOException e) {
            throw Throwables.propagate(e);
        }
    }

    public MediaCodec getEncoder() {
        return encoder;
    }

    /**
     * Extracts all pending data from the encoder and forwards it to the muxer.
     * <p/>
     * If endOfStream is not set, this returns when there is no more data to drain.  If it
     * is set, we send EOS to the encoder, and then iterate until we see EOS on the output.
     * Calling this with endOfStream set should be done once, right before stopping the muxer.
     * <p/>
     * We're just using the muxer to get a .mp4 file (instead of a raw H.264 stream).
     */
    public void drain(boolean endOfStream) {
        final int TIMEOUT_USEC = 10000;

        if (endOfStream) {
            encoder.signalEndOfInputStream();
        }

        ByteBuffer[] encoderOutputBuffers = encoder.getOutputBuffers();
        while (true) {
            int encoderStatus = encoder.dequeueOutputBuffer(bufferInfo, TIMEOUT_USEC);
            if (encoderStatus == MediaCodec.INFO_TRY_AGAIN_LATER) {
                // no output available yet
                if (!endOfStream) {
                    break;      // out of while
                }
            } else if (encoderStatus == MediaCodec.INFO_OUTPUT_BUFFERS_CHANGED) {
                // not expected for an encoder
                encoderOutputBuffers = encoder.getOutputBuffers();
            } else if (encoderStatus == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
                // now that we have the Magic Goodies, start the muxer
                trackIndex = muxer.addTrack(encoder.getOutputFormat());
            } else if (encoderStatus < 0) {
                // let's ignore it
            } else {
                ByteBuffer encodedData = encoderOutputBuffers[encoderStatus];
                checkState(encodedData != null, "encoderOutputBuffer %s was null", encoderStatus);

                if ((bufferInfo.flags & MediaCodec.BUFFER_FLAG_CODEC_CONFIG) != 0) {
                    // The codec config data was pulled out and fed to the muxer when we got
                    // the INFO_OUTPUT_FORMAT_CHANGED status.  Ignore it.
                    bufferInfo.size = 0;
                }

                if (bufferInfo.size != 0) {
                    // adjust the ByteBuffer values to match BufferInfo (not needed?)
                    encodedData.position(bufferInfo.offset);
                    encodedData.limit(bufferInfo.offset + bufferInfo.size);

                    muxer.writeSampleData(trackIndex, encodedData, bufferInfo);
                }

                encoder.releaseOutputBuffer(encoderStatus, false);

                if ((bufferInfo.flags & MediaCodec.BUFFER_FLAG_END_OF_STREAM) != 0) {
                    break;      // out of while
                }
            }
        }
    }

    public void release() {
        if (encoder != null) {
            try {
                encoder.stop();
            } catch (Exception e) {
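                // Ignored so that release() below always runs; stop() throws if the codec is in the wrong state.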
            }
            encoder.release();
        }
    }
}

Any idea why it could be failing when running concurrently?

Answer:

OK, I have finally achieved the original poster's ultimate goal. The issue was as I expected: the timestamps being generated for the audio track did not exactly match what the video track was giving us.

My solution was to pass the surface timestamp that our VideoEncoder stores in its BufferInfo to the AudioEncoder as well, instead of calculating that timestamp from the running time of the thread, as the original poster was doing. I simply took the timestamp from the surface and used it as my AudioEncoder's BufferInfo timestamp. You must make sure the buffer for your audio recorder is set large enough to handle this, since we will not be draining audio at the sample rate but at the frame rate of the video. This is trivial to work out: at 44.1 kHz mono 16-bit PCM and 15 fps video, for example, each video-frame interval covers 44100 / 15 = 2940 samples, i.e. 5880 bytes.

To be clear, the audio and the video encoding still take place on separate threads, but whenever I call mVideoEncoder.onFrameAvailable to send a message to the video encoder thread with the surface's timestamp, I do the same thing for the AudioEncoder thread, using the timestamp of the surface texture that we use for video encoding. This has the desired result: a fully functional MP4 video with both audio and video tracks, without the stuttering that was happening originally. I hope this helps anyone who is having a similar problem now or has had one in the past.
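To make that concrete, here is a minimal sketch of the timestamp sharing. The SharedTimestamp helper and its wiring are my own invented names, not the answerer's actual code; the point is only that the audio thread reuses the latest video timestamp instead of deriving its own from System.nanoTime().

import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper: shares the latest video frame timestamp with the audio thread. */
public class SharedTimestamp {
    private final AtomicLong lastVideoPtsNs = new AtomicLong();

    /** Video thread: call with (st.getTimestamp() - startWhen), before swapBuffers(). */
    public void publish(long ptsNs) {
        lastVideoPtsNs.set(ptsNs);
    }

    /** Audio thread: presentation time (in microseconds) for queueInputBuffer(). */
    public long latestPtsUs() {
        return lastVideoPtsNs.get() / 1000;
    }
}

With this, sendAudioToEncoder() would pass latestPtsUs() to queueInputBuffer() instead of the System.nanoTime()-based value it computes now.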

Answer:

The issue must be that you call muxer.addTrack(encoder.getOutputFormat()) when the other thread has already started to write sample data (muxer.writeSampleData(trackIndex, encodedData, bufferInfo)). This causes an IllegalStateException in MediaMuxer, but you are not catching it; you just call releaseAudio() in the finally section. Two options:

  • You should try to synchronize the threads: wait until both of them have called muxer.addTrack(encoder.getOutputFormat()), and only then allow them to write samples via muxer.writeSampleData(trackIndex, encodedData, bufferInfo) (see the sketch after this list).

  • Or run the audio encoding on the same thread as the video encoding.
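As a rough sketch of the first option, under made-up names (LatchedMuxer is not from the answer; it restates what the question's wait/notifyAll wrapper attempts, using java.util.concurrent primitives instead):

import android.media.MediaCodec;
import android.media.MediaFormat;
import android.media.MediaMuxer;

import java.nio.ByteBuffer;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical wrapper: no writeSampleData() can run before all tracks are added. */
public class LatchedMuxer {
    private final MediaMuxer muxer;
    private final AtomicInteger pendingTracks;
    private final CountDownLatch started = new CountDownLatch(1);

    public LatchedMuxer(MediaMuxer muxer, int trackCount) {
        this.muxer = muxer;
        this.pendingTracks = new AtomicInteger(trackCount);
    }

    /** Blocks until every track is registered; exactly one caller starts the muxer. */
    public int addTrack(MediaFormat format) throws InterruptedException {
        int trackIndex;
        synchronized (this) {
            trackIndex = muxer.addTrack(format);  // only legal before start()
        }
        if (pendingTracks.decrementAndGet() == 0) {
            muxer.start();          // the last registrar starts the muxer...
            started.countDown();    // ...and releases the threads waiting below
        } else {
            started.await();
        }
        return trackIndex;
    }

    public synchronized void writeSampleData(int trackIndex, ByteBuffer buffer,
                                             MediaCodec.BufferInfo info) {
        muxer.writeSampleData(trackIndex, buffer, info);
    }
}

Because Encoder.drain() only calls writeSampleData() after its addTrack() call has returned, every write is then guaranteed to happen after muxer.start().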
