
FFMPEG - Get the exact calculated audio filesize after encode

I'm trying to estimate the size of an audio (MP3) file before encoding it with ffmpeg; afterward, the output needs to have exactly the calculated filesize.

Here is the formula I'm using to predict and calculate the filesize (I hope it's not wrong):

((Bitrate x Duration) / 8) x 1000 = Filesize in bytes, with the bitrate in kbit/s and the duration in seconds.

I'll give a real example so that everyone can understand the use case.

Example:

Given an m4a file with the following properties:

  • Name: xxx.m4a (assumed)
  • Filesize: 8 304 014 bytes (8.3 MB)
  • Bitrate: 256k
  • Duration: 260 seconds

Expected filesize: ((256 x 260) / 8) x 1000 = 8 320 000 bytes
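
The arithmetic above can be checked with a few lines of Python. This is only a sketch of the naive estimate: the function name `predicted_size_bytes` is made up for illustration, and it assumes a constant bitrate given in kbit/s with 1 kbit = 1000 bits.

```python
def predicted_size_bytes(bitrate_kbps: int, duration_s: float) -> float:
    """Naive CBR estimate: bits per second times seconds, over 8 bits per byte."""
    return bitrate_kbps * 1000 * duration_s / 8


# Values from the example: 256 kbit/s for 260 seconds.
print(predicted_size_bytes(256, 260))  # 8320000.0
```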

Then I run the following ffmpeg command:

ffmpeg -i xxx.m4a -f mp3 -y -minrate 256k -maxrate 256k -bufsize 256k -b:a 256k -fs 8320000 output.mp3

Console output:

ffmpeg version 2.7.2 Copyright (c) 2000-2015 the FFmpeg developers
  built with Apple LLVM version 6.1.0 (clang-602.0.53) (based on LLVM 3.6.0svn)
  configuration: --prefix=/usr/local/Cellar/ffmpeg/2.7.2_1 --enable-shared --enable-pthreads --enable-gpl --enable-version3 --enable-hardcoded-tables --enable-avresample --cc=clang     --host-cflags= --host-ldflags= --enable-opencl --enable-libx264 --enable-libmp3lame --enable-libvo-aacenc --enable-libxvid --enable-vda
  libavutil      54. 27.100 / 54. 27.100
  libavcodec     56. 41.100 / 56. 41.100
  libavformat    56. 36.100 / 56. 36.100
  libavdevice    56.  4.100 / 56.  4.100
  libavfilter     5. 16.101 /  5. 16.101
  libavresample   2.  1.  0 /  2.  1.  0
  libswscale      3.  1.101 /  3.  1.101
  libswresample   1.  2.100 /  1.  2.100
  libpostproc    53.  3.100 / 53.  3.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'xxx.m4a':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2mp41
    encoder         : Lavf56.36.100
  Duration: 00:04:20.53, start: 0.000000, bitrate: 254 kb/s
    Stream #0:0(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 253 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
Output #0, mp3, to 'output.mp3':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2mp41
    TSSE            : Lavf56.36.100
    Stream #0:0(und): Audio: mp3 (libmp3lame), 44100 Hz, stereo, fltp, 256 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
      encoder         : Lavc56.41.100 libmp3lame
Stream mapping:
  Stream #0:0 -> #0:0 (aac (native) -> mp3 (libmp3lame))
Press [q] to stop, [?] for help
size=    8127kB time=00:04:20.02 bitrate= 256.1kbits/s    
video:0kB audio:8127kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.011765%    

Problem and questions:

  • Can you tell me why I'm getting an output of 8 322 546 bytes and not 8 320 000 as expected?
  • Is there something wrong with my formula or the ffmpeg command?
  • What solution can you suggest to get exactly the predicted filesize?

Thank you in advance.

Besides the muxing overhead inherent in the container, MP3 audio is stored in frames, and each frame holds a fixed 1152 samples. The encoder outputs only whole frames, so for an output sampling rate of 44100 Hz, the closest it can get to 260 seconds is

ceiling of (260 x 44100 / 1152) = 9954 frames = ~260.02285 seconds.

That alone throws your calculation off, even if the encoding assumptions were right.

Even then, the bit reservoir may come into play.
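
The frame rounding can be sketched in Python as follows. The function name `whole_frame_estimate` is hypothetical, and the byte count assumes pure constant-bitrate audio frames with no tags, extra headers, or other muxing overhead, so it illustrates the rounding effect rather than predicting the real file size.

```python
import math

SAMPLES_PER_FRAME = 1152  # samples per MP3 frame, as stated above


def whole_frame_estimate(duration_s, sample_rate_hz, bitrate_bps):
    """Round the duration up to whole frames and estimate the audio payload."""
    frames = math.ceil(duration_s * sample_rate_hz / SAMPLES_PER_FRAME)
    actual_duration_s = frames * SAMPLES_PER_FRAME / sample_rate_hz
    # Assumption: every second of audio costs exactly bitrate_bps / 8 bytes.
    audio_bytes = actual_duration_s * bitrate_bps / 8
    return frames, actual_duration_s, audio_bytes


frames, actual_s, audio_bytes = whole_frame_estimate(260, 44100, 256_000)
print(frames)              # 9954
print(round(actual_s, 5))  # 260.02286
print(round(audio_bytes))  # 8320731
```

Under these assumptions the audio payload already exceeds 8 320 000 bytes by several hundred bytes before any container overhead is counted.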

Edit:

You can drop the bitrate and add silent padding, but this isn't precise either, since muxing overhead comes into play:

ffmpeg -i xxx.m4a -f lavfi -t 5 -i anullsrc -lavfi "[0:a][1:a]concat=n=2:v=0:a=1" -f mp3 -y -minrate 224k -maxrate 224k -bufsize 224k -b:a 224k -fs N output.mp3

Here, the -fs value N should be calculated for the duration of the MP3 plus the 5 seconds of padding.
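
One possible reading of that calculation, as a sketch: apply the same bitrate-times-duration formula to the padded duration. The helper name `fs_limit_bytes` is hypothetical, and the 260-second source duration is taken from the question; the result is still subject to the frame rounding and muxing overhead described above.

```python
def fs_limit_bytes(bitrate_kbps, source_duration_s, padding_s=5):
    """Size limit for -fs: CBR bytes for the source plus the silent padding."""
    total_s = source_duration_s + padding_s
    return int(bitrate_kbps * 1000 * total_s / 8)


n = fs_limit_bytes(224, 260)
print(n)  # 7420000

# The same flags as the command above, with N filled in.
cmd = [
    "ffmpeg", "-i", "xxx.m4a", "-f", "lavfi", "-t", "5", "-i", "anullsrc",
    "-lavfi", "[0:a][1:a]concat=n=2:v=0:a=1", "-f", "mp3", "-y",
    "-minrate", "224k", "-maxrate", "224k", "-bufsize", "224k",
    "-b:a", "224k", "-fs", str(n), "output.mp3",
]
print(" ".join(cmd))
```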
