
Is it possible to stitch MP3 frames together?

I'm working with CBR, no bit reservoir, 192k bitrate, and 48k sample rate MP3 files.

CBR + 192k bitrate + 48k sample rate gives a clean 576 bytes per frame.
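The 576-byte figure follows from the usual MPEG-1 Layer III frame-size formula, 144 × bitrate / sample rate (plus one byte when the padding bit is set). A minimal sketch of that arithmetic; the function names are illustrative and not taken from any library:

```python
SAMPLES_PER_FRAME = 1152  # MPEG-1 Layer III: two granules of 576 samples each


def mp3_frame_bytes(bitrate_bps: int, sample_rate_hz: int, padding: int = 0) -> int:
    """Size in bytes of one MPEG-1 Layer III frame, header included."""
    return 144 * bitrate_bps // sample_rate_hz + padding


def mp3_frame_ms(sample_rate_hz: int) -> float:
    """Duration in milliseconds of the audio carried by one frame."""
    return 1000 * SAMPLES_PER_FRAME / sample_rate_hz


print(mp3_frame_bytes(192_000, 48_000))  # 576
print(mp3_frame_ms(48_000))              # 24.0
```

At 48 kHz the division is exact, so no frame needs a padding byte and every frame is the same size. At 44.1 kHz it is not exact, and frame sizes alternate by one byte.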

The bit reservoir is disabled so that each frame is independent.

The reason I want to stitch them is that I want to stream the MP3 (chunk by chunk).

Therefore I need to decode each chunk into PCM for playback.

When stitching the raw PCM data of the decoded MP3 together, I can hear a click/glitch/silence/something between each chunk on playback.

How can I stream MP3 perfectly without any click, considering my constraints (only CBR, no bit reservoir, etc)? Is it even possible?

I don't think you can naively cut and concatenate MP3 frames. The Inverse Modified Discrete Cosine Transform (IMDCT), which is part of the decoding process, has different windowing modes, and the mode is signaled within each MP3 frame. In at least one windowing mode the IMDCT reuses values from the previous MP3 frame, which means you need to decode the previous frame in order to decode the current frame correctly.

Let's assume you have packets from file a and file b and you would like to play:

a1 a2 a3 b6 b7 b8

To decode b6 correctly, you need to decode b5 first and then throw away the PCM samples of b5. So at the cut you have to prime the decoder with b5 without playing b5.

a1 a2 a3 [b5] b6 b7 b8

You could send your player an additional packet at each cut and then signal the player to discard the priming samples decoded from that additional packet.
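The priming idea can be sketched as a small planning helper that, for a chunk starting at a given frame, works out which bytes to fetch and how many decoded samples to drop. This is only a sketch under stated assumptions: the file begins directly with audio frames (no ID3v2 tag or Xing/Info frame), every frame is 576 bytes as in the question, one priming frame is enough, and the decoder emits 1152 samples per channel per frame. `plan_chunk` is a hypothetical name, and the exact number of samples to discard may differ with a particular decoder's delay.

```python
FRAME_BYTES = 576          # from the question: CBR 192k at 48 kHz, no padding
SAMPLES_PER_FRAME = 1152   # MPEG-1 Layer III, per channel (assumed decoder output)


def plan_chunk(first_frame: int, frame_count: int, priming_frames: int = 1):
    """Return (byte_start, byte_end, samples_to_discard) for one chunk.

    first_frame is the zero-based index of the first frame to be played.
    The byte range also covers the priming frame(s) that precede it.
    """
    priming = min(priming_frames, first_frame)  # nothing exists before frame 0
    byte_start = (first_frame - priming) * FRAME_BYTES
    byte_end = (first_frame + frame_count) * FRAME_BYTES
    return byte_start, byte_end, priming * SAMPLES_PER_FRAME


# b6 b7 b8 from the example above: b6 is zero-based frame 5.
print(plan_chunk(5, 3))  # (2304, 4608, 1152)
```

The player would decode bytes 2304 to 4608 with a fresh decoder and skip the first 1152 samples per channel before appending the rest to the playback buffer.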

MP3 files can be streamed without hiccups or clicks, but doing so may require some additional audio processing and buffering.

One option is to cross-fade between audio segments: to smooth the transition, the last few milliseconds of audio from one chunk are blended with the first few milliseconds of the next chunk. Another option is to buffer a small amount of audio data before playback, so that playback can continue while the next block of audio is being decoded. Both methods can be implemented with audio decoding and playback tools such as ffmpeg and ffplay. Here is an illustration of how to decode, fade, and play an MP3 file using ffmpeg:

Step 1: Decode the MP3 file into PCM format using ffmpeg.

ffmpeg -i input.mp3 -f s16le -ar 48000 -ac 2 - |

Step 2: Use ffmpeg to apply a fade-in and a fade-out to the PCM data.

ffmpeg -f s16le -ar 48000 -ac 2 -i - -af "afade=t=out:st=5:d=2, afade=t=in:st=0:d=2" -f s16le -ar 48000 -ac 2 - |

The command above applies a 2-second fade-in starting at 0 seconds and a 2-second fade-out starting at 5 seconds. Note that afade fades a single stream in or out; it does not blend two chunks together, so the start and duration values would have to be adapted to each chunk's length.

Step 3: Play the PCM data with ffplay.

ffplay -f s16le -ar 48000 -ac 2 -

This plays the decoded audio with the fades applied, which can help mask clicks at the chunk boundaries.
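The cross-fade described in this answer (blending the tail of one chunk into the head of the next) can also be done directly on the decoded PCM. Below is a minimal sketch, assuming mono samples held in plain Python lists of integers and a linear fade; `crossfade` is an illustrative name, and for interleaved stereo the same blend would be applied per channel.

```python
def crossfade(tail, head, overlap):
    """Blend the last `overlap` samples of `tail` into the first `overlap` of `head`.

    Returns one list that is `overlap` samples shorter than tail + head.
    """
    if overlap <= 0:
        return list(tail) + list(head)
    if overlap > min(len(tail), len(head)):
        raise ValueError("overlap is longer than one of the chunks")

    blended = []
    for i in range(overlap):
        w = (i + 1) / (overlap + 1)            # weight of the incoming chunk
        a = tail[len(tail) - overlap + i]      # outgoing sample
        b = head[i]                            # incoming sample
        blended.append(round((1 - w) * a + w * b))

    return list(tail[:len(tail) - overlap]) + blended + list(head[overlap:])


print(crossfade([100, 100, 100, 100], [0, 0, 0, 0], 3))  # [100, 75, 50, 25, 0]
```

Because the overlapped region is played only once, the chunks need to be cut with that much extra audio on each side. A cross-fade masks the discontinuity rather than removing its cause.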

