
Understanding of MediaCodec and MediaExtractor

I want to do some processing on audio files without playing them, just math. I'm not sure I'm doing it right and have several questions. I read some examples, but most of them are about video streaming and don't work with raw data at all.

  1. I prepared an mp3 file that has 2 identical channels, i.e. it is stereo but the left and the right are the same. After decoding I expected to get a buffer with pairs of equal numbers, because PCM-16 stores the samples of the channels alternately, like { L R L R L R ... }, right? E.g.:

    { 105 105 601 601 -243 -243 -484 -484 ... }

    But I get pairs of close but not equal numbers:

    { -308 -264 -1628 -1667 -2568 -2550 -4396 -4389 }

    Do mp3 algorithms encode the same values differently, or why is that?

  2. I want to process data in packs of 1024 samples. If there are not enough samples for another pack, I want to save the rest until the next batch of raw data (see mExcess in the code). Is there a guarantee that the order will be kept?

  3. I used to understand "sample" as every single value of audio data. Here I see the MediaExtractor::readSampleData and MediaExtractor::advance methods. The first returns ~2000 values, while the description of the second says "Advance to the next sample". Is this just an overlap of naming? I saw a couple of examples where these methods are called as a pair in a loop. Is my usage correct?

Here is my code:

public static void foo(String filepath) throws IOException {
    final int SAMPLES_PER_CHUNK = 1024;

    MediaExtractor mediaExtractor = new MediaExtractor();
    mediaExtractor.setDataSource(filepath);
    MediaFormat mediaFormat = mediaExtractor.getTrackFormat(0);
    mediaExtractor.release();

    MediaCodecList mediaCodecList = new MediaCodecList(MediaCodecList.ALL_CODECS);
    mediaFormat.setString(MediaFormat.KEY_FRAME_RATE, null);
    String codecName = mediaCodecList.findDecoderForFormat(mediaFormat);
    mediaFormat.setInteger(MediaFormat.KEY_FRAME_RATE, 0);  // MediaCodec crashes with JNI
                                                            // error if FRAME_RATE is null
    MediaCodec mediaCodec = MediaCodec.createByCodecName(codecName);
    mediaCodec.setCallback(new MediaCodec.Callback() {
        private MediaExtractor mExtractor;
        private short[] mExcess;

        @Override
        public void onInputBufferAvailable(MediaCodec codec, int index) {
            if (mExtractor == null) {
                mExtractor = new MediaExtractor();
                try {
                    mExtractor.setDataSource(filepath);
                    mExtractor.selectTrack(0);
                } catch (IOException e) {
                    e.printStackTrace();
                }
                mExcess = new short[0];
            }
            ByteBuffer in = codec.getInputBuffer(index);
            in.clear();
            int sampleSize = mExtractor.readSampleData(in, 0);
            if (sampleSize > 0) {
                boolean isOver = !mExtractor.advance();
                codec.queueInputBuffer(
                        index,
                        0,
                        sampleSize,
                        mExtractor.getSampleTime(),
                        isOver ? MediaCodec.BUFFER_FLAG_END_OF_STREAM : 0);
            } else {
                int helloAmaBreakpoint = 1;
            }
        }

        @Override
        public void onOutputBufferAvailable(
                MediaCodec codec,
                int index,
                MediaCodec.BufferInfo info) {
            ByteBuffer tmp = codec.getOutputBuffer(index);
            if (tmp.limit() == 0) return;

            ShortBuffer out = tmp.order(ByteOrder.nativeOrder()).asShortBuffer();
            // Prepend the remainder from previous batch to the new data
            short[] buf = new short[mExcess.length + out.limit()];
            System.arraycopy(mExcess, 0, buf, 0, mExcess.length);
            out.get(buf, mExcess.length, out.limit());

            final int channelCount
                    = codec.getOutputFormat().getInteger(MediaFormat.KEY_CHANNEL_COUNT);
            for (
                    int offset  = 0;
                    offset + SAMPLES_PER_CHUNK * channelCount < buf.length;
                    offset += SAMPLES_PER_CHUNK * channelCount) {

                double[] x = new double[SAMPLES_PER_CHUNK];  // left channel
                double[] y = new double[SAMPLES_PER_CHUNK];  // right channel
                switch (channelCount) {
                    case 1:  // if 1 channel then make 2 identical arrays
                        for (int i = 0; i < SAMPLES_PER_CHUNK; ++i) {
                            x[i] = (double) buf[offset + i];
                            y[i] = (double) buf[offset + i];
                        }
                        break;
                    case 2:  // if 2 channels then read values alternately
                        for (int i = 0; i < SAMPLES_PER_CHUNK; ++i) {
                            x[i] = (double) buf[offset + i * 2];
                            y[i] = (double) buf[offset + i * 2 + 1];
                        }
                        break;
                    default:
                        throw new IllegalStateException("No algorithm for " + channelCount + " channels");
                }

                /// ... some processing ... ///
            }

            // Save the rest until next batch of raw data
            int samplesLeft = buf.length % (SAMPLES_PER_CHUNK * channelCount);
            mExcess = new short[samplesLeft];
            System.arraycopy(
                    buf,
                    buf.length - samplesLeft,
                    mExcess,
                    0,
                    samplesLeft);

            codec.releaseOutputBuffer(index, false);
            if ((info.flags & MediaCodec.BUFFER_FLAG_END_OF_STREAM) > 0) {
                codec.stop();
                codec.release();
                mExtractor.release();
            }
        }

        @Override
        public void onError(MediaCodec codec, MediaCodec.CodecException e) {

        }

        @Override
        public void onOutputFormatChanged(MediaCodec codec, MediaFormat format) {

        }
    });

    mediaFormat.setInteger(MediaFormat.KEY_PCM_ENCODING, AudioFormat.ENCODING_PCM_16BIT);
    mediaCodec.configure(mediaFormat, null, null, 0);
    mediaCodec.start();
}

A quick code review is also welcome.

  1. I'm not exactly sure why it would code them this way, but I think that small variance is within the expected tolerance. Keep in mind that mp3 is a lossy codec, so the output values from the decoder won't be the same as the input; they only need to be close enough that the audible representation matches. But that doesn't explain why the two channels would end up subtly different.

  2. Yes, the order of the decoded frames will be preserved. The exact values won't match the original input, but it should sound similar.

  3. In MediaExtractor, a sample is one encoded packet of data, which you should feed to the decoder. For mp3, one such packet typically decodes to 1152 samples (per channel).
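Since the decoder hands back its output in order, the carry-over idea from the question can be tried out in plain Java, without MediaCodec. The following is only a minimal sketch: `PcmChunker`, `push`, `pendingShorts` and `channel` are hypothetical names made up for this illustration, and it assumes interleaved PCM-16 input. Note that the loop uses `<=` where the question's loop uses `<`. With `<`, a buffer that ends exactly on a chunk boundary skips its last chunk, and because the remainder is then computed as 0, that chunk is lost.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: collects interleaved PCM-16 data and emits fixed-size chunks. */
final class PcmChunker {
    private final int chunkLength;          // shorts per chunk = frames per chunk * channels
    private short[] excess = new short[0];  // leftover shorts from the previous buffer

    PcmChunker(int framesPerChunk, int channelCount) {
        this.chunkLength = framesPerChunk * channelCount;
    }

    /** Appends one decoded buffer and returns every chunk that is now complete. */
    List<short[]> push(short[] decoded) {
        // Prepend the remainder of the previous buffer to the new data.
        short[] buf = new short[excess.length + decoded.length];
        System.arraycopy(excess, 0, buf, 0, excess.length);
        System.arraycopy(decoded, 0, buf, excess.length, decoded.length);

        List<short[]> chunks = new ArrayList<>();
        int offset = 0;
        // "<=" so that a buffer ending exactly on a chunk boundary is emitted too.
        while (offset + chunkLength <= buf.length) {
            chunks.add(Arrays.copyOfRange(buf, offset, offset + chunkLength));
            offset += chunkLength;
        }
        // Whatever was not consumed waits for the next buffer.
        excess = Arrays.copyOfRange(buf, offset, buf.length);
        return chunks;
    }

    /** Number of shorts currently held back for the next call. */
    int pendingShorts() {
        return excess.length;
    }

    /** Extracts one channel (0 = left for stereo) from an interleaved chunk. */
    static double[] channel(short[] interleaved, int channelCount, int channel) {
        double[] out = new double[interleaved.length / channelCount];
        for (int i = 0; i < out.length; i++) {
            out[i] = interleaved[i * channelCount + channel];
        }
        return out;
    }

    public static void main(String[] args) {
        // Stereo buffers of 1152 samples per channel, regrouped into 1024-sample chunks.
        PcmChunker chunker = new PcmChunker(1024, 2);
        short[] decodedBuffer = new short[1152 * 2];
        for (int i = 0; i < 3; i++) {
            int complete = chunker.push(decodedBuffer).size();
            System.out.println("buffer " + i + ": " + complete + " chunk(s), "
                    + chunker.pendingShorts() + " shorts pending");
        }
    }
}
```

In `onOutputBufferAvailable`, the shorts copied out of the `ShortBuffer` would be passed to `push`, and each returned chunk split with `channel` before processing. Keeping the remainder as "everything after the last consumed offset" avoids having a separate modulo calculation that can disagree with the loop condition.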
