How can I mux (merge) a video and an audio file so that the audio loops in the output video when it is too short?

Question

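Background

I need to merge a video file and an audio file into a single video file, so that:

The output video file has the same duration as the input video file.
The audio in the output file comes only from the input audio file. If it is too short, it loops until the end (it may be cut off at the end if needed). That is, once the audio has finished playing while the video has not, it should play again and again until the video ends (a concatenation of the audio).

From what I have read, the technical term for this kind of merging is "muxing".

For example, given a 10-second input video and a 4-second audio file, the output video will be 10 seconds long (always the same as the input video), and the audio will play 2.5 times: the first 2 plays cover the first 8 seconds, and then 2 of the audio's 4 seconds cover the remaining 2 seconds.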
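The arithmetic of this example can be sketched in a few lines of plain Kotlin. This is only an illustration: `LoopPlan` and `loopPlan` are hypothetical names, not part of any Android API, and durations are in microseconds because that is the unit MediaExtractor uses for sample times.

```kotlin
// Hypothetical helper: how often a short audio clip plays during a video.
data class LoopPlan(val fullPlays: Long, val remainderUs: Long)

fun loopPlan(videoDurationUs: Long, audioDurationUs: Long): LoopPlan {
    require(videoDurationUs >= 0) { "video duration must not be negative" }
    require(audioDurationUs > 0) { "audio duration must be positive" }
    return LoopPlan(
        fullPlays = videoDurationUs / audioDurationUs,   // complete passes
        remainderUs = videoDurationUs % audioDurationUs  // part of the last pass
    )
}
```

For the 10-second video and 4-second audio above, `loopPlan(10_000_000L, 4_000_000L)` gives 2 full plays and a remainder of 2,000,000 microseconds.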

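The problems

Although I found a solution for muxing video and audio (here), I ran into several problems:

I can't work out how to loop the writing of the audio content when needed. Whatever I try, it gives me an error.
The input files must be of specific file formats. Otherwise it might throw an exception or, in very rare cases, worse: create a video file with black content. What's more, sometimes a ".mkv" file (for example) is fine and sometimes it is not accepted (even though both play in a video player app).
The current code works with buffers rather than real durations. This means that in many cases it may stop muxing the audio even though it shouldn't, so the audio in the output file ends up shorter than the original video, even when the video is long enough.

What I've tried

I tried to make the audio's MediaExtractor go back to the beginning each time it reaches the end, using: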

        if (audioBufferInfo.size < 0) {
            Log.d("AppLog", "reached end of audio, looping...")
            audioExtractor.seekTo(0, MediaExtractor.SEEK_TO_CLOSEST_SYNC)
            audioBufferInfo.size = audioExtractor.readSampleData(audioBuf, 0)
        }

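To check the types of the files, I tried using MediaMetadataRetriever and then checking the mime-type. I think the supported ones are listed in the docs (here), as those marked with "Encoder". I'm not sure about this. I also don't know which of the mime types mentioned there corresponds to which file format.
I also tried re-initializing everything related to the audio, but that didn't work either.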
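As a side note on the mime-type check: the MIME type of each individual track can also be read from MediaExtractor itself, rather than assuming that track 0 is the one wanted. The following is a minimal sketch, and `findTrackIndex` is a hypothetical helper name. Finding a track this way does not guarantee that MediaMuxer will accept its format.

```kotlin
import android.media.MediaExtractor
import android.media.MediaFormat

// Hypothetical helper: index of the first track whose MIME type starts with
// the given prefix ("video/" or "audio/"), or -1 if there is no such track.
fun findTrackIndex(extractor: MediaExtractor, mimePrefix: String): Int {
    for (i in 0 until extractor.trackCount) {
        val mime = extractor.getTrackFormat(i).getString(MediaFormat.KEY_MIME)
        if (mime != null && mime.startsWith(mimePrefix)) return i
    }
    return -1
}
```

After `setDataSource(...)`, a call such as `findTrackIndex(audioExtractor, "audio/")` could replace the hard-coded `0` passed to `selectTrack` and `getTrackFormat`, with a negative result treated as "no usable track".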

Here's my current muxing code (full sample project available here):

object VideoAndAudioMuxer {
    //   based on:  https://stackoverflow.com/a/31591485/878126
    @WorkerThread
    fun joinVideoAndAudio(videoFile: File, audioFile: File, outputFile: File): Boolean {
        try {
            //            val videoMediaMetadataRetriever = MediaMetadataRetriever()
            //            videoMediaMetadataRetriever.setDataSource(videoFile.absolutePath)
            //            val videoDurationInMs =
            //                videoMediaMetadataRetriever.extractMetadata(MediaMetadataRetriever.METADATA_KEY_DURATION).toLong()
            //            val videoMimeType =
            //                videoMediaMetadataRetriever.extractMetadata(MediaMetadataRetriever.METADATA_KEY_MIMETYPE)
            //            val audioMediaMetadataRetriever = MediaMetadataRetriever()
            //            audioMediaMetadataRetriever.setDataSource(audioFile.absolutePath)
            //            val audioDurationInMs =
            //                audioMediaMetadataRetriever.extractMetadata(MediaMetadataRetriever.METADATA_KEY_DURATION).toLong()
            //            val audioMimeType =
            //                audioMediaMetadataRetriever.extractMetadata(MediaMetadataRetriever.METADATA_KEY_MIMETYPE)
            //            Log.d(
            //                "AppLog",
            //                "videoDuration:$videoDurationInMs audioDuration:$audioDurationInMs videoMimeType:$videoMimeType audioMimeType:$audioMimeType"
            //            )
            //            videoMediaMetadataRetriever.release()
            //            audioMediaMetadataRetriever.release()
            outputFile.delete()
            outputFile.createNewFile()
            val muxer = MediaMuxer(outputFile.absolutePath, MediaMuxer.OutputFormat.MUXER_OUTPUT_MPEG_4)
            val sampleSize = 256 * 1024
            //video
            val videoExtractor = MediaExtractor()
            videoExtractor.setDataSource(videoFile.absolutePath)
            videoExtractor.selectTrack(0)
            videoExtractor.seekTo(0, MediaExtractor.SEEK_TO_CLOSEST_SYNC)
            val videoFormat = videoExtractor.getTrackFormat(0)
            val videoTrack = muxer.addTrack(videoFormat)
            val videoBuf = ByteBuffer.allocate(sampleSize)
            val videoBufferInfo = MediaCodec.BufferInfo()
//            Log.d("AppLog", "Video Format $videoFormat")
            //audio
            val audioExtractor = MediaExtractor()
            audioExtractor.setDataSource(audioFile.absolutePath)
            audioExtractor.selectTrack(0)
            audioExtractor.seekTo(0, MediaExtractor.SEEK_TO_CLOSEST_SYNC)
            val audioFormat = audioExtractor.getTrackFormat(0)
            val audioTrack = muxer.addTrack(audioFormat)
            val audioBuf = ByteBuffer.allocate(sampleSize)
            val audioBufferInfo = MediaCodec.BufferInfo()
//            Log.d("AppLog", "Audio Format $audioFormat")
            //
            muxer.start()
//            Log.d("AppLog", "muxing video&audio...")
            //            val minimalDurationInMs = Math.min(videoDurationInMs, audioDurationInMs)
            while (true) {
                videoBufferInfo.size = videoExtractor.readSampleData(videoBuf, 0)
                audioBufferInfo.size = audioExtractor.readSampleData(audioBuf, 0)
                if (audioBufferInfo.size < 0) {
                    //                    Log.d("AppLog", "reached end of audio, looping...")
                    //TODO somehow start from beginning of the audio again, for looping till the video ends
                    //                    audioExtractor.seekTo(0, MediaExtractor.SEEK_TO_CLOSEST_SYNC)
                    //                    audioBufferInfo.size = audioExtractor.readSampleData(audioBuf, 0)
                }
                if (videoBufferInfo.size < 0 || audioBufferInfo.size < 0) {
//                    Log.d("AppLog", "reached end of video")
                    videoBufferInfo.size = 0
                    audioBufferInfo.size = 0
                    break
                } else {
                    //                    val donePercentage = videoExtractor.sampleTime / minimalDurationInMs / 10L
                    //                    Log.d("AppLog", "$donePercentage")
                    // video muxing
                    videoBufferInfo.presentationTimeUs = videoExtractor.sampleTime
                    videoBufferInfo.flags = videoExtractor.sampleFlags
                    muxer.writeSampleData(videoTrack, videoBuf, videoBufferInfo)
                    videoExtractor.advance()
                    // audio muxing
                    audioBufferInfo.presentationTimeUs = audioExtractor.sampleTime
                    audioBufferInfo.flags = audioExtractor.sampleFlags
                    muxer.writeSampleData(audioTrack, audioBuf, audioBufferInfo)
                    audioExtractor.advance()
                }
            }
            muxer.stop()
            muxer.release()
//            Log.d("AppLog", "success")
            return true
        } catch (e: Exception) {
            e.printStackTrace()
//            Log.d("AppLog", "Error " + e.message)
        }
        return false
    }
}

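I also tried using the FFMPEG library (here and here) to see how to do it. It worked fine, but it has some possible issues: the library seems to take up a lot of space, it has annoying licensing terms, and for some reason the sample couldn't play the output file I had it create unless I removed something from the command, which made the conversion much slower. I would really prefer to use the built-in API rather than this library, even though it is a very powerful library... Also, for some input files it didn't seem to loop...

The questions

How can I mux the video and audio files so that the audio loops when it is shorter (in duration) than the video?
How can I cut the audio precisely when the video ends (with nothing left over on either the video or the audio)?
How can I check, before calling this function, whether the current device can handle the given input files and actually mux them? Is there a way to check at runtime what is supported for this kind of operation, instead of relying on a list in the docs that might change in the future?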

Answer 1

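I had the same scenario.

1: When audioBufferInfo.size < 0, seek back to the start. But remember that you need to accumulate presentationTimeUs.
2: Get the video's duration, and when the looped audio reaches that duration (also using presentationTimeUs), cut it.
3: The audio file needs to be MediaFormat.MIMETYPE_AUDIO_AMR_NB, MediaFormat.MIMETYPE_AUDIO_AMR_WB or MediaFormat.MIMETYPE_AUDIO_AAC. On my test devices it works fine.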
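Steps 1 and 2 can be modelled without any Android classes, which makes the timestamp bookkeeping easier to see. The sketch below is hypothetical: `loopedTimestampsUs` and its parameters are illustrative names. It also assumes that the length of one pass through the audio is known and advances the offset by that full length after each pass.

```kotlin
// Hypothetical, Android-free model of steps 1 and 2: given the sample times of
// one pass through the audio, return the times to write when the audio is
// looped and then cut at the video duration.
fun loopedTimestampsUs(
    passTimestampsUs: List<Long>,  // sample times of a single pass, ascending
    passDurationUs: Long,          // length of one pass (assumed known)
    videoDurationUs: Long
): List<Long> {
    require(passTimestampsUs.isNotEmpty()) { "audio has no samples" }
    require(passDurationUs > 0) { "pass duration must be positive" }
    val out = mutableListOf<Long>()
    var offsetUs = 0L  // accumulated over completed passes (step 1)
    while (true) {
        for (t in passTimestampsUs) {
            val ptsUs = t + offsetUs
            if (ptsUs >= videoDurationUs) return out  // cut at the video end (step 2)
            out += ptsUs
        }
        offsetUs += passDurationUs  // "seek to start" while time keeps increasing
    }
}
```

With one sample per second in a 4-second audio and a 10-second video, the result runs from 0 to 9,000,000 microseconds in steps of 1,000,000, so the timestamps stay strictly increasing across the loop points.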

Here is the code:

private fun muxing(musicName: String) {
    val saveFile = File(DirUtils.getPublicMediaPath(), "$saveName.mp4")
    if (saveFile.exists()) {
        saveFile.delete()
        PhotoHelper.sendMediaScannerBroadcast(saveFile)
    }
    try {
        // get the video file duration in microseconds
        val duration = getVideoDuration(mSaveFile!!.absolutePath)

        saveFile.createNewFile()

        val videoExtractor = MediaExtractor()
        videoExtractor.setDataSource(mSaveFile!!.absolutePath)

        val audioExtractor = MediaExtractor()
        val afdd = MucangConfig.getContext().assets.openFd(musicName)
        audioExtractor.setDataSource(afdd.fileDescriptor, afdd.startOffset, afdd.length)

        val muxer = MediaMuxer(saveFile.absolutePath, MediaMuxer.OutputFormat.MUXER_OUTPUT_MPEG_4)

        videoExtractor.selectTrack(0)
        val videoFormat = videoExtractor.getTrackFormat(0)
        val videoTrack = muxer.addTrack(videoFormat)

        audioExtractor.selectTrack(0)
        val audioFormat = audioExtractor.getTrackFormat(0)
        val audioTrack = muxer.addTrack(audioFormat)

        var sawEOS = false
        val offset = 100
        val sampleSize = 1000 * 1024
        val videoBuf = ByteBuffer.allocate(sampleSize)
        val audioBuf = ByteBuffer.allocate(sampleSize)
        val videoBufferInfo = MediaCodec.BufferInfo()
        val audioBufferInfo = MediaCodec.BufferInfo()

        videoExtractor.seekTo(0, MediaExtractor.SEEK_TO_CLOSEST_SYNC)
        audioExtractor.seekTo(0, MediaExtractor.SEEK_TO_CLOSEST_SYNC)

        muxer.start()

        val frameRate = videoFormat.getInteger(MediaFormat.KEY_FRAME_RATE)
        val videoSampleTime = 1000 * 1000 / frameRate

        while (!sawEOS) {
            videoBufferInfo.offset = offset
            videoBufferInfo.size = videoExtractor.readSampleData(videoBuf, offset)

            if (videoBufferInfo.size < 0) {
                sawEOS = true
                videoBufferInfo.size = 0

            } else {
                videoBufferInfo.presentationTimeUs += videoSampleTime
                videoBufferInfo.flags = videoExtractor.sampleFlags
                muxer.writeSampleData(videoTrack, videoBuf, videoBufferInfo)
                videoExtractor.advance()
            }
        }

        var sawEOS2 = false
        // Timestamp offset carried over from the passes through the audio so far
        var sampleTime = 0L
        while (!sawEOS2) {

            audioBufferInfo.offset = offset
            audioBufferInfo.size = audioExtractor.readSampleData(audioBuf, offset)

            if (audioBufferInfo.presentationTimeUs >= duration) {
                // The looped audio has reached the video duration: stop after this iteration
                sawEOS2 = true
                audioBufferInfo.size = 0
            } else {
                if (audioBufferInfo.size < 0) {
                    // End of the audio: remember the last written timestamp, rewind and loop
                    sampleTime = audioBufferInfo.presentationTimeUs
                    audioExtractor.seekTo(0, MediaExtractor.SEEK_TO_CLOSEST_SYNC)
                    continue
                }
            }
            // Shift the extractor's timestamp by the accumulated offset
            audioBufferInfo.presentationTimeUs = audioExtractor.sampleTime + sampleTime
            audioBufferInfo.flags = audioExtractor.sampleFlags
            muxer.writeSampleData(audioTrack, audioBuf, audioBufferInfo)
            audioExtractor.advance()
        }

        muxer.stop()
        muxer.release()
        videoExtractor.release()
        audioExtractor.release()
        afdd.close()
    } catch (e: Exception) {
        LogUtils.e(TAG, "Mixer Error:" + e.message)
    }
}
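The code above calls `getVideoDuration(...)`, whose implementation is not shown, and compares the result with `presentationTimeUs`, which is in microseconds. One possible stand-in, assuming the duration is read with MediaMetadataRetriever, might look like this. It is a sketch and not the answerer's actual helper.

```kotlin
import android.media.MediaMetadataRetriever

// Hypothetical stand-in for the getVideoDuration() helper used above.
// Returns the duration in microseconds, or 0 if it cannot be determined.
fun getVideoDuration(path: String): Long {
    val retriever = MediaMetadataRetriever()
    try {
        retriever.setDataSource(path)
        // METADATA_KEY_DURATION is reported in milliseconds, as a string
        val durationMs = retriever
            .extractMetadata(MediaMetadataRetriever.METADATA_KEY_DURATION)
            ?.toLongOrNull() ?: 0L
        return durationMs * 1000L
    } finally {
        retriever.release()
    }
}
```

A result of 0 would make the audio loop above stop immediately, so a caller may prefer to treat it as an error instead.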

Solution 1
2 2019-09-19 06:15:34
