FFMPEG -- using 'amix' to combine a short audio clip with a video results in the final video's sound cutting off early

I am trying to combine the following:

(a): a 29-second video clip that has its own audio lasting the entire duration

(b): an audio clip, about 2 seconds long, that I want to play at the start of the video together with the original audio

I can use 'amix' to produce a video with the combined audio, but the final video's audio cuts off at around 26 of the video's 29 seconds and goes silent.

What doesn't make any sense is that the resulting video otherwise plays as it should, with the audio successfully mixed. But the output video's audio stream loses the last 3 seconds.

Here's the 'amix' command I'm using (sent via subprocess):

subprocess.call(['ffmpeg', '-i', 'input.mp4', '-i', 'audioclip.mp3', '-filter_complex', 'amix', 'output.mp4'])

I've also used versions of this command that spell out -map "0:a" and -map "1:a", and tried 'amix=inputs=2:duration:longest' among many other additions. All lead to the same problem: the final combined video's audio drops out with 3 seconds remaining, even though the original 'input.mp4' has audio for the full 29 seconds.

Does anyone know why these last several seconds of audio from (a) are missing in the final video?

_________________________________________________________________

edit: Below is my output when I run the amix command listed above:

Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'RuneBearinstakill_advanced.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf59.20.101
  Duration: 00:00:29.77, start: 0.000000, bitrate: 5441 kb/s
  Stream #0:0[0x1](eng): Video: h264 (High) (avc1 / 0x31637661), yuv420p(tv, bt470bg/bt470bg/smpte170m, progressive), 1920x1080 [SAR 1:1 DAR 16:9], 5304 kb/s, 30 fps, 30 tbr, 15360 tbn (default)
    Metadata:
      handler_name    : Bento4 Video Handler
      vendor_id       : [0][0][0][0]
  Stream #0:1[0x2](eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 128 kb/s (default)
    Metadata:
      handler_name    : Bento4 Sound Handler
      vendor_id       : [0][0][0][0]
[mp3 @ 000001f0c8ec2040] Estimating duration from bitrate, this may be inaccurate
Input #1, mp3, from 'TTS_clip.mp3':
  Duration: 00:00:01.90, start: 0.000000, bitrate: 32 kb/s
  Stream #1:0: Audio: mp3, 24000 Hz, mono, fltp, 32 kb/s
Stream mapping:
  Stream #0:1 (aac) -> amix (graph 0)
  Stream #1:0 (mp3float) -> amix (graph 0)
  amix:default (graph 0) -> Stream #0:0 (aac)
  Stream #0:0 -> #0:1 (h264 (native) -> h264 (libx264))
Press [q] to stop, [?] for help
[libx264 @ 000001f0c8cbe5c0] using SAR=1/1
[libx264 @ 000001f0c8cbe5c0] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
[libx264 @ 000001f0c8cbe5c0] profile High, level 4.0, 4:2:0, 8-bit
[libx264 @ 000001f0c8cbe5c0] 264 - core 164 r3094 bfc87b7 - H.264/MPEG-4 AVC codec - Copyleft 2003-2022 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=24 lookahead_threads=4 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to 'RuneBearinstakill_advancedwithtts.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf59.20.101
  Stream #0:0: Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 128 kb/s
    Metadata:
      encoder         : Lavc59.25.100 aac
  Stream #0:1(eng): Video: h264 (avc1 / 0x31637661), yuv420p(tv, bt470bg/bt470bg/smpte170m, progressive), 1920x1080 [SAR 1:1 DAR 16:9], q=2-31, 30 fps, 15360 tbn (default)
    Metadata:
      handler_name    : Bento4 Video Handler
      vendor_id       : [0][0][0][0]
      encoder         : Lavc59.25.100 libx264
    Side data:
      cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: N/A
frame=  893 fps=110 q=-1.0 Lsize=   18717kB time=00:00:29.66 bitrate=5168.5kbits/s speed=3.66x    
video:18256kB audio:433kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.150179%
[aac @ 000001f0c8f9ebc0] Qavg: 921.259
[libx264 @ 000001f0c8cbe5c0] frame I:4     Avg QP:21.33  size: 71366
[libx264 @ 000001f0c8cbe5c0] frame P:633   Avg QP:23.32  size: 23837
[libx264 @ 000001f0c8cbe5c0] frame B:256   Avg QP:25.22  size: 12968
[libx264 @ 000001f0c8cbe5c0] consecutive B-frames: 57.2% 10.3% 10.1% 22.4%
[libx264 @ 000001f0c8cbe5c0] mb I  I16..4: 17.9% 71.4% 10.8%
[libx264 @ 000001f0c8cbe5c0] mb P  I16..4:  6.9% 17.6%  0.8%  P16..4: 43.1%  6.5%  1.5%  0.0%  0.0%    skip:23.6%
[libx264 @ 000001f0c8cbe5c0] mb B  I16..4:  1.5%  4.2%  0.3%  B16..8: 39.7%  4.6%  0.5%  direct: 1.6%  skip:47.6%  L0:55.9% L1:41.8% BI: 2.3%
[libx264 @ 000001f0c8cbe5c0] 8x8 transform intra:69.5% inter:87.3%
[libx264 @ 000001f0c8cbe5c0] coded y,uvDC,uvAC intra: 35.6% 26.8% 0.8% inter: 13.4% 10.8% 0.0%
[libx264 @ 000001f0c8cbe5c0] i16 v,h,dc,p: 21% 37% 12% 30%
[libx264 @ 000001f0c8cbe5c0] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 25% 26% 21%  4%  5%  5%  6%  4%  5%
[libx264 @ 000001f0c8cbe5c0] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 24% 28% 15%  5%  7%  7%  7%  5%  4%
[libx264 @ 000001f0c8cbe5c0] i8c dc,h,v,p: 67% 18% 14%  1%
[libx264 @ 000001f0c8cbe5c0] Weighted P-Frames: Y:0.2% UV:0.0%
[libx264 @ 000001f0c8cbe5c0] ref P L0: 72.3% 15.4%  8.7%  3.6%  0.0%
[libx264 @ 000001f0c8cbe5c0] ref B L0: 88.9%  9.5%  1.6%
[libx264 @ 000001f0c8cbe5c0] ref B L1: 97.7%  2.3%
[libx264 @ 000001f0c8cbe5c0] kb/s:5024.13

And here is the output when I check the stream durations for the input video and the output video, showing that the output video's audio stream is somehow several seconds shorter after the amix:

Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'RuneBearinstakill_advanced.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf59.20.101
  Duration: 00:00:29.77, start: 0.000000, bitrate: 5403 kb/s
  Stream #0:0[0x1](eng): Video: h264 (High) (avc1 / 0x31637661), yuv420p(tv, bt470bg/bt470bg/smpte170m, progressive), 1920x1080 [SAR 1:1 DAR 16:9], 5266 kb/s, 30 fps, 30 tbr, 15360 tbn (default)
    Metadata:
      handler_name    : Bento4 Video Handler
      vendor_id       : [0][0][0][0]
  Stream #0:1[0x2](eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 128 kb/s (default)
    Metadata:
      handler_name    : Bento4 Sound Handler
      vendor_id       : [0][0][0][0]
[STREAM]
duration=29.766667
[/STREAM]
[STREAM]
duration=29.738000
[/STREAM]

Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'RuneBearinstakill_advancedwithtts.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf59.20.101
  Duration: 00:00:29.77, start: 0.000000, bitrate: 5098 kb/s
  Stream #0:0[0x1](und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 128 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
      vendor_id       : [0][0][0][0]
  Stream #0:1[0x2](eng): Video: h264 (High) (avc1 / 0x31637661), yuv420p(tv, bt470bg/bt470bg/smpte170m, progressive), 1920x1080 [SAR 1:1 DAR 16:9], 4971 kb/s, 30 fps, 30 tbr, 15360 tbn (default)
    Metadata:
      handler_name    : Bento4 Video Handler
      vendor_id       : [0][0][0][0]
[STREAM]
duration=27.477000
[/STREAM]
[STREAM]
duration=29.766667
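
For anyone who wants to repeat this duration check from Python, here is a minimal sketch. The exact ffprobe invocation used above isn't shown, so the `-show_entries stream=duration` call below is an assumption that happens to produce the same `[STREAM]` blocks; `parse_stream_durations` and `probe_stream_durations` are hypothetical helper names. The sample text is the duration listing for the mixed output file, with the final closing tag added.

```python
import subprocess


def parse_stream_durations(ffprobe_output):
    """Return the duration in seconds of each [STREAM] block, in order."""
    durations = []
    for line in ffprobe_output.splitlines():
        line = line.strip()
        if line.startswith("duration="):
            value = line.split("=", 1)[1]
            # ffprobe prints "N/A" when a stream has no known duration.
            durations.append(None if value == "N/A" else float(value))
    return durations


def probe_stream_durations(path):
    """Run ffprobe on `path` (needs ffprobe on PATH) and parse its output."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-show_entries", "stream=duration", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_stream_durations(result.stdout)


# Durations reported for the mixed output file: stream 0 is audio, stream 1 is video.
sample = """[STREAM]
duration=27.477000
[/STREAM]
[STREAM]
duration=29.766667
[/STREAM]"""

audio, video = parse_stream_durations(sample)
print(f"audio is {video - audio:.2f} s shorter than video")
```

Comparing the two numbers per file makes the problem easy to spot: in the input the streams differ by a few hundredths of a second, while in the output the audio stream is more than two seconds short.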

I found the fix. It turned out I needed to apply aresample=async=1 to the input video's audio stream in the filter_complex for the amix command.

aresample=async=1

Ultimately my amix filter looked like this:

'[0:a]aresample=async=1[0a];[1:a]volume=2.0[1a];[0a][1a]amix=inputs=2'

I found this kind of solution in a similar question over at superuser: https://superuser.com/questions/1234493/ffmpeg-amix-audio-to-video-with-some-audio-in-parts
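
For completeness, here is a sketch of how that filter graph could be sent through subprocess as a full command. According to the FFmpeg resampler documentation, `async=1` makes aresample match audio timestamps by filling and trimming, so a plausible (unconfirmed) reading is that the source audio had timestamp gaps that amix did not compensate for. The helper name `build_amix_command`, the `[aout]` output label, the explicit `-map` options, and `-c:v copy` are my own additions for illustration and were not part of the command I actually ran.

```python
import shlex


def build_amix_command(video_path, clip_path, output_path, clip_volume=2.0):
    """Build an ffmpeg argument list that mixes a short clip into a video's audio."""
    filter_graph = (
        # Resample the video's audio so its timestamps are filled/trimmed to match.
        "[0:a]aresample=async=1[0a];"
        # Boost the short clip so it is audible over the original audio.
        f"[1:a]volume={clip_volume}[1a];"
        # Mix both streams; amix's default duration mode is "longest".
        "[0a][1a]amix=inputs=2[aout]"
    )
    return [
        "ffmpeg",
        "-i", video_path,
        "-i", clip_path,
        "-filter_complex", filter_graph,
        "-map", "0:v",      # keep the original video stream
        "-map", "[aout]",   # use the mixed audio as the only audio stream
        "-c:v", "copy",     # assumption: the video does not need re-encoding
        output_path,
    ]


cmd = build_amix_command("input.mp4", "audioclip.mp3", "output.mp4")
print(shlex.join(cmd))
# To actually run it (requires ffmpeg on PATH):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with the brackets and semicolons in the filter graph.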
