Why does libmp3lame add zeros to the start of the MP3?

I have an uncompressed .wav file that I convert to a 96k MP3 file:

ffmpeg.exe -i song.wav -vn -b:a 96000 -ac 2 -ar 48000 -acodec libmp3lame -y song.mp3

The input file has 637386 samples. The output has 639360 samples. The extra samples in the MP3 are all zeros at the beginning of the file. This happens with every file I've converted, and with more codecs than just libmp3lame. Is this an FFmpeg bug or a codec bug? Why are these samples added? Is there a way to stop them from being added?

Edit: Simplified example and console output:

ffmpeg.exe -i song.wav -y song.mp3

ffmpeg version N-55796-gb74213d Copyright (c) 2000-2013 the FFmpeg developers
  built on Aug 26 2013 19:43:51 with gcc 4.7.3 (GCC)
  configuration: --enable-gpl --enable-version3 --disable-w32threads --enable-avisynth --enable-bzlib --enable-fontconfig --enable-frei0r --enable-gnutls --enable-iconv --enable-libass --enable-libbluray --enable-libcaca --enable-libfreetype --enable-libgsm --enable-libilbc --enable-libmodplug --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-librtmp --enable-libschroedinger --enable-libsoxr --enable-libspeex --enable-libtheora --enable-libtwolame --enable-libvo-aacenc --enable-libvo-amrwbenc --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libxavs --enable-libxvid --enable-zlib
  libavutil      52. 42.100 / 52. 42.100
  libavcodec     55. 29.100 / 55. 29.100
  libavformat    55. 14.102 / 55. 14.102
  libavdevice    55.  3.100 / 55.  3.100
  libavfilter     3. 82.102 /  3. 82.102
  libswscale      2.  5.100 /  2.  5.100
  libswresample   0. 17.103 /  0. 17.103
  libpostproc    52.  3.100 / 52.  3.100
Guessed Channel Layout for  Input Stream #0.0 : stereo
Input #0, wav, from 'song.wav':
  Duration: 00:00:13.28, bitrate: 1538 kb/s
    Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 48000 Hz, stereo, s16, 1536 kb/s
Output #0, mp3, to 'song.mp3':
  Metadata:
    TSSE            : Lavf55.14.102
    Stream #0:0: Audio: mp3 (libmp3lame), 48000 Hz, stereo, s16p
Stream mapping:
  Stream #0:0 -> #0:0 (pcm_s16le -> libmp3lame)
Press [q] to stop, [?] for help
size=     208kB time=00:00:13.29 bitrate= 128.4kbits/s
video:0kB audio:208kB subtitle:0 global headers:0kB muxing overhead 0.111205%

Number of samples in wav: 637386

Number of samples in mp3: 639984

The amount of delay added by LAME in FFmpeg is:

avctx->initial_padding = lame_get_encoder_delay(s->gfp) + 528 + 1;
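
To put a number on that expression, here is a minimal sketch of the arithmetic. It assumes `lame_get_encoder_delay` returns 576 samples, which is LAME's usual default but is not stated in the question, so treat it as an assumption. The 528 is the decoder delay that the LAME FAQ describes. The Python names are illustrative only.

```python
# Assumed value: LAME's default encoder delay in samples (not given in the post).
ASSUMED_LAME_ENCODER_DELAY = 576
# Decoder delay in samples, as described in the LAME FAQ.
DECODER_DELAY = 528


def initial_padding(encoder_delay: int = ASSUMED_LAME_ENCODER_DELAY) -> int:
    """Mirror the FFmpeg expression: encoder delay + 528 + 1."""
    return encoder_delay + DECODER_DELAY + 1


print(initial_padding())  # 1105 under the assumed 576-sample encoder delay
```

If the encoder delay differs in a given build, pass the real value to `initial_padding` instead of relying on the assumed default.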

From the FAQ of the LAME project:

2. Why does LAME add silence to the beginning each song?

DECODER DELAY AT START OF FILE:

All decoders I have tested introduce a delay of 528 samples. That is, after decoding an mp3 file, the output will have 528 samples of 0's appended to the front. This is because the standard MDCT/filterbank routines used by the ISO have a 528 sample delay. It would be possible to write a MDCT/filterbank routine with a 0 sample delay (see description of Takehiro's MDCT/filterbank routine used in LAME encoding below) but I dont know that anyone has done this. Furthermore, because of the overlapped nature of MDCT frames, the first half of the first granule (1 granule=576 samples) doesn't have a previous frame to overlap with, resulting in attenuation of the first N samples. The value of N depends on the window type. For "STOP_TYPE" and "SHORT_TYPE", N=96, while for "START_TYPE" and "NORMAL_TYPE", N=288. The first frame produced by LAME 3.56 and up will always be of STOP_TYPE or SHORT_TYPE.

ENCODER DELAY AT START OF FILE:

ISO based encoders (BladeEnc, 8hz-mp3, etc) use a MDCT/filterbank routine similar to the one used in decoding, and thus also introduce their own 528 sample delay. A .wav file encoded & decoded will have a 1056 sample delay (1056 samples will be appended to the beginning).

The discrepancy in your output doesn't match the numbers in the FAQ, probably because of technical nuances that I don't know of, but it's not a bug.
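
One possible contributor to the mismatch, offered as a sketch and not as a confirmed explanation: MP3 encoders emit whole frames, and an MPEG-1 Layer III frame holds 1152 samples per channel. If the decoder does not trim the padding, the decoded length is a whole number of frames. The sketch below assumes a 1105-sample initial padding (the FFmpeg expression above with an assumed 576-sample LAME encoder delay) and rounds up to whole frames. The function name is hypothetical.

```python
SAMPLES_PER_FRAME = 1152  # MPEG-1 Layer III frame size, per channel


def padded_length(input_samples: int, initial_padding: int) -> int:
    """Round (input + padding) up to a whole number of MP3 frames."""
    total = input_samples + initial_padding
    frames = -(-total // SAMPLES_PER_FRAME)  # integer ceiling division
    return frames * SAMPLES_PER_FRAME


# 1105 is an assumed padding value, not one taken from the post.
total = padded_length(637386, 1105)
print(total, total - 637386)  # 639360 1974
```

Under these assumptions the result is 639360 samples, which matches the count reported for the first command. It does not by itself explain the 639984 reported for the simplified command.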
