Muxing AAC audio with Android's MediaCodec and MediaMuxer

I'm modifying an Android Framework example to package the elementary AAC streams produced by MediaCodec into a standalone .mp4 file. I'm using a single MediaMuxer instance containing one AAC track generated by a MediaCodec instance.

However, I always eventually get an error message on a call to mMediaMuxer.writeSampleData(trackIndex, encodedData, bufferInfo):

E/MPEG4Writer: timestampUs 0 < lastTimestampUs XXXXX for Audio track

When I queue the raw input data in mCodec.queueInputBuffer(...), I provide 0 as the timestamp value, per the Framework example. (I've also tried using monotonically increasing timestamp values, with the same result; I've successfully encoded raw camera frames to h264/mp4 files with this same method.)
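"Monotonically increasing" here means deriving each timestamp from the amount of PCM data submitted so far rather than passing 0. A minimal sketch of that approach, assuming 16-bit mono PCM at 44,100 Hz (the constants and helper name are illustrative, not from the project):

static final int SAMPLE_RATE = 44100;   // assumed sample rate
static final int BYTES_PER_SAMPLE = 2;  // 16-bit PCM

// Presentation time (in microseconds) for the next input buffer,
// derived from the total number of PCM bytes queued so far.
static long ptsForBytesSubmitted(long numBytesSubmitted) {
    return (numBytesSubmitted / BYTES_PER_SAMPLE) * 1_000_000L / SAMPLE_RATE;
}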

Check out the full source

Most relevant snippet:

private static void testEncoder(String componentName, MediaFormat format, Context c) {
    int trackIndex = 0;
    boolean mMuxerStarted = false;
    File f = FileUtils.createTempFileInRootAppStorage(c, "aac_test_" + new Date().getTime() + ".mp4");
    MediaCodec codec = MediaCodec.createByCodecName(componentName);

    try {
        codec.configure(
                format,
                null /* surface */,
                null /* crypto */,
                MediaCodec.CONFIGURE_FLAG_ENCODE);
    } catch (IllegalStateException e) {
        Log.e(TAG, "codec '" + componentName + "' failed configuration.");

    }

    codec.start();

    try {
        mMediaMuxer = new MediaMuxer(f.getAbsolutePath(), MediaMuxer.OutputFormat.MUXER_OUTPUT_MPEG_4);
    } catch (IOException ioe) {
        throw new RuntimeException("MediaMuxer creation failed", ioe);
    }

    ByteBuffer[] codecInputBuffers = codec.getInputBuffers();
    ByteBuffer[] codecOutputBuffers = codec.getOutputBuffers();

    int numBytesSubmitted = 0;
    boolean doneSubmittingInput = false;
    int numBytesDequeued = 0;

    while (true) {
        int index;

        if (!doneSubmittingInput) {
            index = codec.dequeueInputBuffer(kTimeoutUs /* timeoutUs */);

            if (index != MediaCodec.INFO_TRY_AGAIN_LATER) {
                if (numBytesSubmitted >= kNumInputBytes) {
                    Log.i(TAG, "queueing EOS to inputBuffer");
                    codec.queueInputBuffer(
                            index,
                            0 /* offset */,
                            0 /* size */,
                            0 /* timeUs */,
                            MediaCodec.BUFFER_FLAG_END_OF_STREAM);

                    if (VERBOSE) {
                        Log.d(TAG, "queued input EOS.");
                    }

                    doneSubmittingInput = true;
                } else {
                    int size = queueInputBuffer(
                            codec, codecInputBuffers, index);

                    numBytesSubmitted += size;

                    if (VERBOSE) {
                        Log.d(TAG, "queued " + size + " bytes of input data.");
                    }
                }
            }
        }

        MediaCodec.BufferInfo info = new MediaCodec.BufferInfo();
        index = codec.dequeueOutputBuffer(info, kTimeoutUs /* timeoutUs */);

        if (index == MediaCodec.INFO_TRY_AGAIN_LATER) {
            // no output available yet
        } else if (index == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
            MediaFormat newFormat = codec.getOutputFormat();
            trackIndex = mMediaMuxer.addTrack(newFormat);
            mMediaMuxer.start();
            mMuxerStarted = true;
        } else if (index == MediaCodec.INFO_OUTPUT_BUFFERS_CHANGED) {
            codecOutputBuffers = codec.getOutputBuffers();
        } else {
            // Write to muxer
            ByteBuffer encodedData = codecOutputBuffers[index];
            if (encodedData == null) {
                throw new RuntimeException("encoderOutputBuffer " + index +
                        " was null");
            }

            if ((info.flags & MediaCodec.BUFFER_FLAG_CODEC_CONFIG) != 0) {
                // The codec config data was pulled out and fed to the muxer when we got
                // the INFO_OUTPUT_FORMAT_CHANGED status.  Ignore it.
                if (VERBOSE) Log.d(TAG, "ignoring BUFFER_FLAG_CODEC_CONFIG");
                info.size = 0;
            }

            if (info.size != 0) {
                if (!mMuxerStarted) {
                    throw new RuntimeException("muxer hasn't started");
                }

                // adjust the ByteBuffer values to match BufferInfo (not needed?)
                encodedData.position(info.offset);
                encodedData.limit(info.offset + info.size);

                mMediaMuxer.writeSampleData(trackIndex, encodedData, info);
                if (VERBOSE) Log.d(TAG, "sent " + info.size + " audio bytes to muxer with pts " + info.presentationTimeUs);
            }

            codec.releaseOutputBuffer(index, false);

            // End write to muxer
            numBytesDequeued += info.size;

            if ((info.flags & MediaCodec.BUFFER_FLAG_END_OF_STREAM) != 0) {
                if (VERBOSE) {
                    Log.d(TAG, "dequeued output EOS.");
                }
                break;
            }

            if (VERBOSE) {
                Log.d(TAG, "dequeued " + info.size + " bytes of output data.");
            }
        }
    }

    if (VERBOSE) {
        Log.d(TAG, "queued a total of " + numBytesSubmitted + "bytes, "
                + "dequeued " + numBytesDequeued + " bytes.");
    }

    int sampleRate = format.getInteger(MediaFormat.KEY_SAMPLE_RATE);
    int channelCount = format.getInteger(MediaFormat.KEY_CHANNEL_COUNT);
    int inBitrate = sampleRate * channelCount * 16;  // bit/sec
    int outBitrate = format.getInteger(MediaFormat.KEY_BIT_RATE);

    float desiredRatio = (float)outBitrate / (float)inBitrate;
    float actualRatio = (float)numBytesDequeued / (float)numBytesSubmitted;

    if (actualRatio < 0.9 * desiredRatio || actualRatio > 1.1 * desiredRatio) {
        Log.w(TAG, "desiredRatio = " + desiredRatio
                + ", actualRatio = " + actualRatio);
    }


    codec.release();
    mMediaMuxer.stop();
    mMediaMuxer.release();
    codec = null;
}

Update: I've found a root symptom that I think lies within MediaCodec:

I send presentationTimeUs=1000 to queueInputBuffer(...) but receive info.presentationTimeUs=33219 after calling MediaCodec.dequeueOutputBuffer(info, timeoutUs). fadden left a helpful comment related to this behavior.

Thanks to fadden's help I've got a proof-of-concept audio encoder and video+audio encoder on Github. In summary:

Send AudioRecord's samples to a MediaCodec + MediaMuxer wrapper. Using the system time at audioRecord.read(...) works sufficiently well as an audio timestamp, provided you poll often enough to avoid filling up AudioRecord's internal buffer (to avoid drift between the time you call read and the time AudioRecord recorded the samples). Too bad AudioRecord doesn't directly communicate timestamps...

// Setup AudioRecord
while (isRecording) {
    audioPresentationTimeNs = System.nanoTime();
    audioRecord.read(dataBuffer, 0, samplesPerFrame);
    hwEncoder.offerAudioEncoder(dataBuffer.clone(), audioPresentationTimeNs);
}

Note that AudioRecord only guarantees support for 16-bit PCM samples, though MediaCodec.queueInputBuffer takes input as a byte[]. Passing a byte[] to audioRecord.read(dataBuffer,...) will split the 16-bit samples into bytes for you.
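If you prefer to read into a short[] instead, here's a minimal sketch of packing the samples into the byte layout the encoder expects, assuming a little-endian device (the common case on Android); the buffer names are illustrative, and it needs java.nio.ByteBuffer and java.nio.ByteOrder:

short[] shortBuffer = new short[samplesPerFrame];
// error codes (negative return values) ignored for brevity
int samplesRead = audioRecord.read(shortBuffer, 0, samplesPerFrame);
ByteBuffer packed = ByteBuffer.allocate(samplesRead * 2)
        .order(ByteOrder.LITTLE_ENDIAN);
packed.asShortBuffer().put(shortBuffer, 0, samplesRead);
byte[] pcmBytes = packed.array(); // hand this to the encoder wrapper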

I found that polling in this way still occasionally generated a timestampUs XXX < lastTimestampUs XXX for Audio track error, so I included some logic to keep track of the bufferInfo.presentationTimeUs reported by mediaCodec.dequeueOutputBuffer(bufferInfo, timeoutMs) and adjust it if necessary before calling mediaMuxer.writeSampleData(trackIndex, encodedData, bufferInfo).
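That adjustment logic isn't shown in the snippet above; a minimal sketch of one way to do it (lastAudioPts is a field you would maintain yourself, not part of the original code):

// Force strictly increasing timestamps before handing the buffer to the muxer.
if (bufferInfo.presentationTimeUs <= lastAudioPts) {
    bufferInfo.presentationTimeUs = lastAudioPts + 1; // nudge forward by 1 µs
}
lastAudioPts = bufferInfo.presentationTimeUs;
mediaMuxer.writeSampleData(trackIndex, encodedData, bufferInfo);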

The code from the answer above (https://stackoverflow.com/a/18966374/6463821) can also produce the timestampUs XXX < lastTimestampUs XXX for Audio track error, because if you read from AudioRecord's buffer faster than necessary, the duration between generated timestamps will be smaller than the real duration between the audio samples.

So my solution for this issue is to generate the first timestamp once and then, for each subsequent sample, increase the timestamp by the duration of one buffer (which depends on the sample rate, audio format, and channel config):

BUFFER_DURATION_US = 1_000_000 * (ARR_SIZE / AUDIO_CHANNELS) / SAMPLE_AUDIO_RATE_IN_HZ;

...

long firstPresentationTimeUs = System.nanoTime() / 1000;

...

audioRecord.read(shortBuffer, OFFSET, ARR_SIZE);
long presentationTimeUs = count++ * BUFFER_DURATION_US + firstPresentationTimeUs;
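To make the arithmetic concrete, a worked example with illustrative values (ARR_SIZE = 2048 interleaved shorts, AUDIO_CHANNELS = 2, SAMPLE_AUDIO_RATE_IN_HZ = 44100):

// 2048 shorts across 2 channels = 1024 frames per buffer;
// 1_000_000 µs * 1024 / 44100 Hz ≈ 23,219 µs per buffer.
long BUFFER_DURATION_US = 1_000_000L * (2048 / 2) / 44100; // = 23219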

Reading from AudioRecord should happen on a separate thread, and every buffer that is read should be added to a queue immediately, without waiting for encoding or any other processing, to prevent losing audio samples.

worker = new Thread() {
    @Override
    public void run() {
        try {
            AudioFrameReader reader = new AudioFrameReader(audioRecord);

            while (!isInterrupted()) {
                Thread.sleep(10);
                addToQueue(reader.read());
            }
        } catch (InterruptedException e) {
            Log.w(TAG, "run: ", e);
        }
    }
};

The issue occurs because you receive buffers out of order. Try adding the following test:

if(lastAudioPresentationTime == -1) {
    lastAudioPresentationTime = bufferInfo.presentationTimeUs;
}
else if (lastAudioPresentationTime < bufferInfo.presentationTimeUs) {
    lastAudioPresentationTime = bufferInfo.presentationTimeUs;
}
// Only write buffers whose timestamp hasn't fallen behind the running maximum.
if ((bufferInfo.size != 0) && (lastAudioPresentationTime <= bufferInfo.presentationTimeUs)) {
    if (!mMuxerStarted) {
        throw new RuntimeException("muxer hasn't started");
    }
    // adjust the ByteBuffer values to match BufferInfo (not needed?)
    encodedData.position(bufferInfo.offset);
    encodedData.limit(bufferInfo.offset + bufferInfo.size);
    mMuxer.writeSampleData(trackIndex.index, encodedData, bufferInfo);
}

encoder.releaseOutputBuffer(encoderStatus, false);
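With this check in place, any output buffer whose timestamp falls below the running maximum is skipped rather than written, which avoids the MPEG4Writer error at the cost of discarding those samples.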
