Is there a way to read an MP3 audio file into a numpy array, and write a numpy array back to an MP3 file, with an API similar to scipy.io.wavfile.read and scipy.io.wavfile.write? For example:
sr, x = wavfile.read('test.wav')
wavfile.write('test2.wav', sr, x)
Note: pydub's AudioSegment object doesn't give direct access to a numpy array.
PS: I have already read "Importing sound files into Python as NumPy arrays (alternatives to audiolab)" and tried all the answers, including those that require launching ffmpeg with Popen and reading the content from the stdout pipe. I have also read "Trying to convert an mp3 file to a Numpy Array, and ffmpeg just hangs" and tried the main answers, but there was no simple solution. After spending hours on this, I'm posting it here with "Answer your own question – share your knowledge, Q&A-style". I have also read "How to create a numpy array from a pydub AudioSegment?", but it does not easily cover the multi-channel case.
Calling ffmpeg and manually parsing its stdout, as suggested in many posts about reading an MP3, is tedious (there are many corner cases, since different numbers of channels are possible, etc.), so here is a working solution using pydub (you need to pip install pydub first; pydub itself relies on ffmpeg or libav being installed to decode MP3).
This code reads an MP3 into a numpy array and writes a numpy array to an MP3 file, with an API similar to scipy.io.wavfile.read/write:
import pydub
import numpy as np

def read(f, normalized=False):
    """MP3 to numpy array"""
    a = pydub.AudioSegment.from_mp3(f)
    y = np.array(a.get_array_of_samples())
    if a.channels == 2:
        # samples are interleaved: one row per frame, one column per channel
        y = y.reshape((-1, 2))
    if normalized:
        # scale 16-bit integer samples to floats in [-1, 1)
        return a.frame_rate, np.float32(y) / 2**15
    else:
        return a.frame_rate, y

def write(f, sr, x, normalized=False):
    """numpy array to MP3"""
    channels = 2 if (x.ndim == 2 and x.shape[1] == 2) else 1
    if normalized:  # normalized array - each item should be a float in [-1, 1)
        y = np.int16(x * 2 ** 15)
    else:
        y = np.int16(x)
    # sample_width=2 means 2 bytes (16 bits) per sample
    song = pydub.AudioSegment(y.tobytes(), frame_rate=sr, sample_width=2, channels=channels)
    song.export(f, format="mp3", bitrate="320k")
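The write function above casts x * 2 ** 15 directly to int16, which is fine as long as every value is in [-1, 1). A value of exactly 1.0 scales to 32768, which is outside the int16 range, so the result of that cast is not something to rely on. Here is a minimal numpy-only sketch of a conversion that clips first. The helper names float_to_int16 and int16_to_float are hypothetical and not part of pydub or scipy.

```python
import numpy as np

INT16_SCALE = 2 ** 15  # same scale factor used by read() and write()


def float_to_int16(x):
    """Convert floats nominally in [-1, 1) to int16 (hypothetical helper)."""
    scaled = np.asarray(x, dtype=np.float64) * INT16_SCALE
    # Clip to the representable int16 range before casting, so that
    # values such as 1.0 saturate at 32767 instead of overflowing.
    clipped = np.clip(scaled, -INT16_SCALE, INT16_SCALE - 1)
    return clipped.astype(np.int16)


def int16_to_float(y):
    """Convert int16 samples to float32 in [-1, 1) (hypothetical helper)."""
    return np.asarray(y, dtype=np.float32) / INT16_SCALE


x = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
y = float_to_int16(x)
print(y)
print(int16_to_float(y))
```

If you want this behavior, one option is to call float_to_int16(x) yourself and pass the result to write with normalized=False.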
Notes:
normalized=True lets you work with a float array (each item in [-1, 1)).
Usage example:
sr, x = read('test.mp3')
print(x)
#[[-225 707]
# [-234 782]
# [-205 755]
# ...,
# [ 303 89]
# [ 337 69]
# [ 274 89]]
write('out2.mp3', sr, x)
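The two columns printed above are the two channels of a stereo file. read only reshapes when there are exactly 2 channels. Here is a small numpy-only sketch of how the reshape could be generalized to any channel count, assuming the flat sample array is interleaved frame by frame (which is what the reshape in read relies on). The helper names deinterleave and interleave are hypothetical; inside read you would pass a.channels as the channel count.

```python
import numpy as np


def deinterleave(samples, channels):
    """Reshape a flat, interleaved sample array to (n_frames, channels).

    Hypothetical helper: mono input is returned as a 1-D array, matching
    what read() returns for single-channel files.
    """
    samples = np.asarray(samples)
    if channels == 1:
        return samples
    if samples.size % channels != 0:
        raise ValueError("sample count is not a multiple of the channel count")
    return samples.reshape((-1, channels))


def interleave(x):
    """Flatten an (n_frames, channels) array back to interleaved order."""
    return np.ascontiguousarray(x).reshape(-1)


# Synthetic interleaved stereo data: L0, R0, L1, R1, L2, R2
flat = np.array([-225, 707, -234, 782, -205, 755], dtype=np.int16)
stereo = deinterleave(flat, 2)
print(stereo.shape)
print(interleave(stereo))
```

On the writing side, the matching change would be to take the channel count from x.shape[1] for 2-D input instead of checking only for 2.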
You can use the audio2numpy library. Install it with
pip install audio2numpy
Then, your code would be:
import audio2numpy as a2n
x, sr = a2n.audio_from_file("test.mp3")
For writing, use @Basj's answer.