Android: Encoding audio and video using MediaCodec

I'm trying to encode video from the camera and audio from the microphone using MediaCodec and MediaMuxer. I use OpenGL to overlay text on the image while recording.

I took these classes as example:

I wrote a main class that performs the encoding. It spawns two threads, one for recording audio and one for recording video. It does not work (the generated file is not valid), but if I comment out one of the threads (either audio or video) and set TRACK_COUNT to 1, it works fine. This is the code for the main class:

import android.graphics.SurfaceTexture;
import android.media.AudioFormat;
import android.media.AudioRecord;
import android.media.MediaCodec;
import android.media.MediaCodecInfo;
import android.media.MediaFormat;
import android.media.MediaMuxer;
import android.media.MediaRecorder;

import com.google.common.base.Throwables;

import java.io.IOException;
import java.nio.ByteBuffer;

import static com.google.common.base.Preconditions.checkNotNull;

/**
 * Class for recording a reply including a text message.
 */
public class ReplyRecorder {
    // Encoding state
    private volatile boolean encoding; // written by stop(), read by both encoder threads
    long startWhen;

    // Muxer
    private static final int TRACK_COUNT = 2;
    private Muxer mMuxer;

    // Video
    private static final String VIDEO_MIME_TYPE = "video/avc"; // H.264 Advanced Video Coding
    private static final int FRAME_RATE = 15;                  // 15fps
    private static final int IFRAME_INTERVAL = 10;             // 10 seconds between I-frames
    private static final int BIT_RATE = 2000000;

    private Encoder mVideoEncoder;
    private CodecInputSurface mInputSurface;

    private SurfaceTextureManager mStManager;

    // Audio
    private static final String AUDIO_MIME_TYPE = "audio/mp4a-latm";
    private static final int SAMPLE_RATE = 44100;
    private static final int SAMPLES_PER_FRAME = 1024;
    private static final int CHANNEL_CONFIG = AudioFormat.CHANNEL_IN_MONO;
    private static final int AUDIO_FORMAT = AudioFormat.ENCODING_PCM_16BIT;

    private Encoder mAudioEncoder;
    private AudioRecord audioRecord;

    public void start(final CameraManager cameraManager, final String messageText, final String filePath) {
        checkNotNull(cameraManager);
        checkNotNull(messageText);
        checkNotNull(filePath);

        try {
            // Create a MediaMuxer.  We can't add the video track and start() the muxer here,
            // because our MediaFormat doesn't have the Magic Goodies.  These can only be
            // obtained from the encoder after it has started processing data.
            mMuxer = new Muxer(new MediaMuxer(filePath, MediaMuxer.OutputFormat.MUXER_OUTPUT_MPEG_4), TRACK_COUNT);
            startWhen = System.nanoTime();
            encoding = true;
            new Thread(new Runnable() {
                @Override
                public void run() {
                    initVideoComponents(cameraManager, messageText);
                    encodeVideo(cameraManager);
                }
            }).start();
            new Thread(new Runnable() {
                @Override
                public void run() {
                    initAudioComponents();
                    encodeAudio();
                }
            }).start();
        } catch (IOException e) {
            release();
            throw Throwables.propagate(e);
        }
    }

    private void initVideoComponents(CameraManager cameraManager,
                                     String messageText) {
        try {
            MediaFormat format = MediaFormat.createVideoFormat(VIDEO_MIME_TYPE, cameraManager.getEncWidth(), cameraManager.getEncHeight());

            // Set some properties.  Failing to specify some of these can cause the MediaCodec
            // configure() call to throw an unhelpful exception.
            format.setInteger(MediaFormat.KEY_COLOR_FORMAT,
                    MediaCodecInfo.CodecCapabilities.COLOR_FormatSurface);
            format.setInteger(MediaFormat.KEY_BIT_RATE, BIT_RATE);
            format.setInteger(MediaFormat.KEY_FRAME_RATE, FRAME_RATE);
            format.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, IFRAME_INTERVAL);

            // Create a MediaCodec encoder, and configure it with our format.  Get a Surface
            // we can use for input and wrap it with a class that handles the EGL work.
            //
            // If you want to have two EGL contexts -- one for display, one for recording --
            // you will likely want to defer instantiation of CodecInputSurface until after the
            // "display" EGL context is created, then modify the eglCreateContext call to
            // take eglGetCurrentContext() as the share_context argument.
            mVideoEncoder = new Encoder(VIDEO_MIME_TYPE, format, mMuxer);
            mInputSurface = new CodecInputSurface(mVideoEncoder.getEncoder().createInputSurface());
            mVideoEncoder.getEncoder().start();

            mInputSurface.makeCurrent();
            mStManager = new SurfaceTextureManager(messageText, cameraManager.getEncWidth(), cameraManager.getEncHeight());
        } catch (RuntimeException e) {
            releaseVideo();
            throw e;
        }
    }

    private void encodeVideo(CameraManager cameraManager) {
        try {

            SurfaceTexture st = mStManager.getSurfaceTexture();
            cameraManager.record(st);

            while (encoding) {
                // Feed any pending encoder output into the muxer.
                mVideoEncoder.drain(false);

                // Acquire a new frame of input, and render it to the Surface.  If we had a
                // GLSurfaceView we could switch EGL contexts and call drawImage() a second
                // time to render it on screen.  The texture can be shared between contexts by
                // passing the GLSurfaceView's EGLContext as eglCreateContext()'s share_context
                // argument.
                mStManager.awaitNewImage();
                mStManager.drawImage();

                // Set the presentation time stamp from the SurfaceTexture's time stamp.  This
                // will be used by MediaMuxer to set the PTS in the video.
                mInputSurface.setPresentationTime(st.getTimestamp() - startWhen);

                // Submit it to the encoder.  The eglSwapBuffers call will block if the input
                // is full, which would be bad if it stayed full until we dequeued an output
                // buffer (which we can't do, since we're stuck here).  So long as we fully drain
                // the encoder before supplying additional input, the system guarantees that we
                // can supply another frame without blocking.
                mInputSurface.swapBuffers();
            }

            // send end-of-stream to encoder, and drain remaining output
            mVideoEncoder.drain(true);
        } finally {
            releaseVideo();
        }
    }

    private void initAudioComponents() {
        try {
            int min_buffer_size = AudioRecord.getMinBufferSize(SAMPLE_RATE, CHANNEL_CONFIG, AUDIO_FORMAT);
            int buffer_size = SAMPLES_PER_FRAME * 10;
            if (buffer_size < min_buffer_size)
                buffer_size = ((min_buffer_size / SAMPLES_PER_FRAME) + 1) * SAMPLES_PER_FRAME * 2;

            audioRecord = new AudioRecord(
                    MediaRecorder.AudioSource.MIC,       // source
                    SAMPLE_RATE,                         // sample rate, hz
                    CHANNEL_CONFIG,                      // channels
                    AUDIO_FORMAT,                        // audio format
                    buffer_size);                        // buffer size (bytes)

            /////////////////

            MediaFormat format = new MediaFormat();
            format.setString(MediaFormat.KEY_MIME, AUDIO_MIME_TYPE);
            format.setInteger(MediaFormat.KEY_AAC_PROFILE, MediaCodecInfo.CodecProfileLevel.AACObjectLC);
            format.setInteger(MediaFormat.KEY_SAMPLE_RATE, 44100);
            format.setInteger(MediaFormat.KEY_CHANNEL_COUNT, 1);
            format.setInteger(MediaFormat.KEY_BIT_RATE, 128000);
            format.setInteger(MediaFormat.KEY_MAX_INPUT_SIZE, 16384);

            mAudioEncoder = new Encoder(AUDIO_MIME_TYPE, format, mMuxer);
            mAudioEncoder.getEncoder().start();
        } catch (RuntimeException e) {
            releaseAudio();
            throw e;
        }
    }

    private void encodeAudio() {
        try {
            audioRecord.startRecording();
            while (encoding) {
                mAudioEncoder.drain(false);
                sendAudioToEncoder(false);
            }
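            // TODO: Passing "false" because drain(true) calls signalEndOfInputStream(), which is
            // only valid for encoders fed from an input Surface.  For this ByteBuffer-fed encoder,
            // EOS has to be queued with BUFFER_FLAG_END_OF_STREAM (see sendAudioToEncoder(true)).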
            mAudioEncoder.drain(false);
        } finally {
            releaseAudio();
        }
    }

    public void sendAudioToEncoder(boolean endOfStream) {
        // send current frame data to encoder
        ByteBuffer[] inputBuffers = mAudioEncoder.getEncoder().getInputBuffers();
        int inputBufferIndex = mAudioEncoder.getEncoder().dequeueInputBuffer(-1);
        if (inputBufferIndex >= 0) {
            ByteBuffer inputBuffer = inputBuffers[inputBufferIndex];
            inputBuffer.clear();
            long presentationTimeNs = System.nanoTime();
            int inputLength = audioRecord.read(inputBuffer, SAMPLES_PER_FRAME);
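            // Back-date the timestamp by the duration of the audio just read.  inputLength is in
            // bytes (2 bytes per 16-bit mono sample).  The original expression,
            // (inputLength / SAMPLE_RATE) / 1000000000, was integer division and always gave 0.
            presentationTimeNs -= (inputLength / 2) * 1000000000L / SAMPLE_RATE;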

            long presentationTimeUs = (presentationTimeNs - startWhen) / 1000;
            if (endOfStream) {
                mAudioEncoder.getEncoder().queueInputBuffer(inputBufferIndex, 0, inputLength, presentationTimeUs, MediaCodec.BUFFER_FLAG_END_OF_STREAM);
            } else {
                mAudioEncoder.getEncoder().queueInputBuffer(inputBufferIndex, 0, inputLength, presentationTimeUs, 0);
            }
        }
    }

    public void stop() {
        encoding = false;
    }

    /**
     * Releases encoder resources.
     */
    public void release() {
        releaseVideo();
        releaseAudio();
    }

    private void releaseVideo() {
        if (mVideoEncoder != null) {
            mVideoEncoder.release();
            mVideoEncoder = null;
        }
        if (mInputSurface != null) {
            mInputSurface.release();
            mInputSurface = null;
        }
        if (mStManager != null) {
            mStManager.release();
            mStManager = null;
        }
        releaseMuxer();
    }

    private void releaseAudio() {
        if (audioRecord != null) {
            audioRecord.stop();
            audioRecord = null;
        }
        if (mAudioEncoder != null) {
            mAudioEncoder.release();
            mAudioEncoder = null;
        }
        releaseMuxer();
    }

    private void releaseMuxer() {
        if (mMuxer != null && mVideoEncoder == null && mAudioEncoder == null) {
            mMuxer.release();
            mMuxer = null;
        }
    }

    public boolean isRecording() {
        return mMuxer != null;
    }
}

The class that wraps the muxer and waits for all tracks to be added before starting is the following (I added some synchronized modifiers just for testing):

import android.media.MediaCodec;
import android.media.MediaFormat;
import android.media.MediaMuxer;

import com.google.common.base.Throwables;

import java.nio.ByteBuffer;

import static com.google.common.base.Preconditions.checkNotNull;
import static com.google.common.base.Preconditions.checkState;

/**
 * Class responsible for muxing. Wraps a MediaMuxer.
 */
public class Muxer {
    private final MediaMuxer muxer;
    private final int totalTracks;
    private int trackCounter;

    public Muxer(MediaMuxer muxer, int totalTracks) {
        this.muxer = checkNotNull(muxer);
        this.totalTracks = totalTracks;
    }

    synchronized public int addTrack(MediaFormat format) {
        checkState(!isStarted(), "Muxer already started");
        int trackIndex = muxer.addTrack(format);
        trackCounter++;
        if (isStarted()) {
            muxer.start();
            notifyAll();
        } else {
            while (!isStarted()) {
                try {
                    wait();
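                } catch (InterruptedException e) {
                    // Restore the interrupt flag and actually throw; the original
                    // discarded the returned exception and kept waiting.
                    Thread.currentThread().interrupt();
                    throw Throwables.propagate(e);
                }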
            }
        }
        return trackIndex;
    }

    synchronized public void writeSampleData(int trackIndex, ByteBuffer byteBuf,
                                MediaCodec.BufferInfo bufferInfo) {
        checkState(isStarted(), "Muxer not started");
        muxer.writeSampleData(trackIndex, byteBuf, bufferInfo);
    }

    public void release() {
        if (muxer != null) {
            try {
                muxer.stop();
            } catch (Exception e) {
            }
            muxer.release();
        }
    }

    private boolean isStarted() {
        return trackCounter == totalTracks;
    }
}

And the class responsible for writing to the MediaCodec encoder is the following:

import android.media.MediaCodec;
import android.media.MediaFormat;

import com.google.common.base.Throwables;

import java.io.IOException;
import java.nio.ByteBuffer;

import static com.google.common.base.Preconditions.checkNotNull;
import static com.google.common.base.Preconditions.checkState;

/**
 * Class responsible for encoding.
 */
public class Encoder {
    private final MediaCodec encoder;
    private final Muxer muxer;
    private final MediaCodec.BufferInfo bufferInfo;
    private int trackIndex;


    public Encoder(String mimeType, MediaFormat format, Muxer muxer) {
        checkNotNull(mimeType);
        checkNotNull(format);
        checkNotNull(muxer);

        try {
            encoder = MediaCodec.createEncoderByType(mimeType);
            encoder.configure(format, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);

            this.muxer = muxer;
            bufferInfo = new MediaCodec.BufferInfo();
        } catch (IOException e) {
            throw Throwables.propagate(e);
        }
    }

    public MediaCodec getEncoder() {
        return encoder;
    }

    /**
     * Extracts all pending data from the encoder and forwards it to the muxer.
     * <p/>
     * If endOfStream is not set, this returns when there is no more data to drain.  If it
     * is set, we send EOS to the encoder, and then iterate until we see EOS on the output.
     * Calling this with endOfStream set should be done once, right before stopping the muxer.
     * <p/>
     * We're just using the muxer to get a .mp4 file (instead of a raw H.264 stream).
     */
    public void drain(boolean endOfStream) {
        final int TIMEOUT_USEC = 10000;

        if (endOfStream) {
            encoder.signalEndOfInputStream();
        }

        ByteBuffer[] encoderOutputBuffers = encoder.getOutputBuffers();
        while (true) {
            int encoderStatus = encoder.dequeueOutputBuffer(bufferInfo, TIMEOUT_USEC);
            if (encoderStatus == MediaCodec.INFO_TRY_AGAIN_LATER) {
                // no output available yet
                if (!endOfStream) {
                    break;      // out of while
                }
            } else if (encoderStatus == MediaCodec.INFO_OUTPUT_BUFFERS_CHANGED) {
                // not expected for an encoder
                encoderOutputBuffers = encoder.getOutputBuffers();
            } else if (encoderStatus == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
                // now that we have the Magic Goodies, start the muxer
                trackIndex = muxer.addTrack(encoder.getOutputFormat());
            } else if (encoderStatus < 0) {
                // let's ignore it
            } else {
                ByteBuffer encodedData = encoderOutputBuffers[encoderStatus];
                checkState(encodedData != null, "encoderOutputBuffer %s was null", encoderStatus);

                if ((bufferInfo.flags & MediaCodec.BUFFER_FLAG_CODEC_CONFIG) != 0) {
                    // The codec config data was pulled out and fed to the muxer when we got
                    // the INFO_OUTPUT_FORMAT_CHANGED status.  Ignore it.
                    bufferInfo.size = 0;
                }

                if (bufferInfo.size != 0) {
                    // adjust the ByteBuffer values to match BufferInfo (not needed?)
                    encodedData.position(bufferInfo.offset);
                    encodedData.limit(bufferInfo.offset + bufferInfo.size);

                    muxer.writeSampleData(trackIndex, encodedData, bufferInfo);
                }

                encoder.releaseOutputBuffer(encoderStatus, false);

                if ((bufferInfo.flags & MediaCodec.BUFFER_FLAG_END_OF_STREAM) != 0) {
                    break;      // out of while
                }
            }
        }
    }

    public void release() {
        if (encoder != null) {
            try {
                encoder.stop();
            } catch (Exception e) {
            }
            encoder.release();
        }
    }
}

Any idea why it could be failing when running concurrently?

OK, I have finally achieved the original poster's ultimate goal. The issue was as I expected: the timestamps being generated for the audio track did not exactly match the ones the video track was giving us.

My solution was to pass the Surface timestamp that our VideoEncoder stores in its BufferInfo to the AudioEncoder as well. Instead of calculating the timestamp from the running time of the thread, as the original poster was doing, I took the timestamp from the surface and used it as the AudioEncoder's BufferInfo timestamp. You must make sure that the buffer limit for your audio recorder is large enough to handle this, since audio frames will no longer arrive at the sample rate but at the frame rate of the video. This is trivial to figure out.
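The buffer sizing mentioned above comes down to simple arithmetic. The following is only a rough sketch: the class and method names are hypothetical, and the headroom factor and the stand-in for the platform minimum are assumptions rather than values from the answer.

```java
/** Hypothetical helper (not from the original posts) for sizing the audio read buffer. */
final class AudioBufferSizing {
    private AudioBufferSizing() {}

    /** PCM bytes captured during one video frame interval, rounded up. */
    static int bytesPerVideoFrame(int sampleRateHz, int channels, int bytesPerSample, int videoFps) {
        if (sampleRateHz <= 0 || channels <= 0 || bytesPerSample <= 0 || videoFps <= 0) {
            throw new IllegalArgumentException("all arguments must be positive");
        }
        long bytesPerSecond = (long) sampleRateHz * channels * bytesPerSample;
        return (int) ((bytesPerSecond + videoFps - 1) / videoFps);
    }

    /** Capacity covering several frame intervals, never below the platform minimum. */
    static int recommendedBufferSize(int bytesPerFrame, int framesOfHeadroom, int platformMinimum) {
        return Math.max(bytesPerFrame * framesOfHeadroom, platformMinimum);
    }

    public static void main(String[] args) {
        // Constants from the question: 44100 Hz, mono, 16-bit PCM, 15 fps video.
        int perFrame = bytesPerVideoFrame(44100, 1, 2, 15);
        System.out.println("bytes per video frame: " + perFrame);
        // 4 frames of headroom is an arbitrary choice; 4096 stands in for
        // AudioRecord.getMinBufferSize(...), which is device dependent.
        System.out.println("buffer size: " + recommendedBufferSize(perFrame, 4, 4096));
    }
}
```

With the question's constants, one video frame interval holds 5880 bytes of audio, which fits within the KEY_MAX_INPUT_SIZE of 16384 that the question configures on the audio encoder.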

To be clear, the audio and video encoding still take place on separate threads. Whenever I call mVideoEncoder.onFrameAvailable to send a message to the video encoder thread with the timestamp of the surface, I do the same for the AudioEncoder thread, using the timestamp of the SurfaceTexture that we use for video encoding. This has the desired result: a completely functional MP4 video with both audio and video tracks, without the stuttering that was happening originally. I hope this helps anyone who is having, or has had, a similar problem.
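One way to picture the shared timestamp is a small holder object that the video thread writes and the audio thread reads. This is a minimal sketch under assumptions, not the answerer's actual code: SharedPtsClock and its methods are hypothetical names, and nudging repeated values forward by one microsecond is my own precaution so that the audio track's timestamps keep increasing.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical holder that lets the audio thread reuse the video frame timestamp. */
final class SharedPtsClock {
    private static final long UNSET = Long.MIN_VALUE;

    private final AtomicLong latestVideoPtsNs = new AtomicLong(UNSET);
    private long lastAudioPtsUs = -1; // touched only by the audio thread

    /** Video thread: pass the SurfaceTexture timestamp (minus the recording start), in ns. */
    void onVideoFrame(long ptsNs) {
        latestVideoPtsNs.set(ptsNs);
    }

    /**
     * Audio thread: presentation time in microseconds for the next audio buffer,
     * or -1 if no video frame has been seen yet.
     */
    long nextAudioPtsUs() {
        long ns = latestVideoPtsNs.get();
        if (ns == UNSET) {
            return -1;
        }
        long us = ns / 1000;
        if (us <= lastAudioPtsUs) {
            us = lastAudioPtsUs + 1; // assumption: keep per-track timestamps strictly increasing
        }
        lastAudioPtsUs = us;
        return us;
    }

    public static void main(String[] args) {
        SharedPtsClock clock = new SharedPtsClock();
        clock.onVideoFrame(66_000_000L);            // a frame 66 ms into the recording
        System.out.println(clock.nextAudioPtsUs()); // value to pass to queueInputBuffer
    }
}
```

In the question's code, the returned value would take the place of presentationTimeUs in sendAudioToEncoder, and onVideoFrame would be called next to mInputSurface.setPresentationTime.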

The issue must be that you call muxer.addTrack(encoder.getOutputFormat()) when the other thread has already started writing sample data (muxer.writeSampleData(trackIndex, encodedData, bufferInfo)). This causes an IllegalStateException in MediaMuxer, but you are not catching it; you just call releaseAudio() in the finally block.

  • You should try to synchronize the threads. Wait until both threads have called muxer.addTrack(encoder.getOutputFormat()), and only then allow them to write samples with muxer.writeSampleData(trackIndex, encodedData, bufferInfo).

  • Or run audio encoding on the same thread as video encoding.
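
The first option could be sketched with a small gate that every encoder thread passes through before writing samples. This is an illustrative sketch only: TrackStartGate and its methods are hypothetical names, the MediaMuxer calls appear only in comments, and it assumes the caller serializes the addTrack calls itself (for example with a synchronized block).

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical gate: no thread writes samples until every track has been added. */
final class TrackStartGate {
    private final AtomicInteger remainingTracks;
    private final CountDownLatch muxerStarted = new CountDownLatch(1);

    TrackStartGate(int trackCount) {
        if (trackCount < 1) {
            throw new IllegalArgumentException("trackCount must be >= 1");
        }
        remainingTracks = new AtomicInteger(trackCount);
    }

    /** Call after adding a track. Returns true for exactly one caller: the one adding the last track. */
    boolean registerTrack() {
        return remainingTracks.decrementAndGet() == 0;
    }

    /** Call once the muxer has been started; releases all waiting threads. */
    void markStarted() {
        muxerStarted.countDown();
    }

    /** Waits until the muxer is started. Returns false on timeout or interruption. */
    boolean awaitStarted(long timeout, TimeUnit unit) {
        try {
            return muxerStarted.await(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        TrackStartGate gate = new TrackStartGate(2);
        Runnable encoderThread = () -> {
            // Real code: trackIndex = muxer.addTrack(encoder.getOutputFormat());
            if (gate.registerTrack()) {
                // Real code: muxer.start();
                gate.markStarted();
            }
            boolean ready = gate.awaitStarted(1, TimeUnit.SECONDS);
            System.out.println(Thread.currentThread().getName() + " may write samples: " + ready);
        };
        Thread video = new Thread(encoderThread, "video");
        Thread audio = new Thread(encoderThread, "audio");
        video.start();
        audio.start();
        video.join();
        audio.join();
    }
}
```

The timeout is a design choice: if one encoder never reports its output format, the other thread gives up instead of blocking forever, which the wait() loop in the question's Muxer wrapper would do.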
