
Soundfile imports audio in two different formats

I am attempting to preprocess audio files for use in a neural net with soundfile.read(), but the function formats the returned data differently for different .FLAC files with the same sample rate and length. For example, calling data, sr = soundfile.read(audiofile1) produced an array with shape data.shape = (48000, 2) (where each element was either the amplitude, 0, or the negative amplitude, as NumPy float64), while calling data, sr = soundfile.read(audiofile2) produced an array with shape data.shape = (48000,) (where the elements were varied NumPy float64 values).

Also, if it helps, audiofile1 was a recording made via PyAudio, whereas audiofile2 was a sample from the LibriSpeech corpus.

So, my question is twofold:

Why is soundfile.read() producing two different data formats, and how do I ensure that the function returns the arrays in the same format in the future?

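Your audiofile2 sample is mono, whereas your audiofile1 recording is stereo (i.e., you probably recorded it with a PyAudio stream configured with channels=2). By default, soundfile.read() returns a one-dimensional array of shape (frames,) for a mono file and a two-dimensional array of shape (frames, channels) for a multi-channel file, which is why the two shapes differ. So I suggest you first figure out whether you need mono or stereo for your application.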

If all you really care about is a mono audio signal, you can convert stereo (or more generally N-channel) audio to mono by averaging the channels:

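import numpy as np
import soundfile

data, sr = soundfile.read(audiofile)
if np.ndim(data) > 1:  # multi-channel: shape is (frames, channels)
    data = np.mean(data, axis=1)  # average the channels into one mono signal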

If you need stereo audio, you can create a second channel by duplicating the one you have (although this does not add the information a true stereo recording carries, such as phase or amplitude differences between the channels):

if np.ndim(data) < 2:  # mono: shape is (frames,)
    data = np.tile(data, (2, 1)).transpose()  # duplicate into shape (frames, 2)
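
The two conversions above can be folded into one helper. This is only a sketch: the name to_channels and its n_channels parameter are made up for illustration, and it assumes that averaging the channels is an acceptable downmix for your data.

```python
import numpy as np

def to_channels(data, n_channels):
    """Return audio shaped (frames, n_channels) from mono or multi-channel input.

    Hypothetical helper, not part of soundfile or NumPy.
    """
    data = np.asarray(data)
    if data.ndim == 1:
        data = data[:, np.newaxis]  # (frames,) -> (frames, 1)
    if data.shape[1] == n_channels:
        return data  # already in the requested layout
    # Downmix to a single channel by averaging, then duplicate it as needed.
    mono = data.mean(axis=1, keepdims=True)
    return np.repeat(mono, n_channels, axis=1)
```

Calling to_channels(data, 1)[:, 0] gives a one-dimensional mono signal, and to_channels(data, 2) gives a (frames, 2) array regardless of how the file was recorded.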

It's as simple as:

data, sr = soundfile.read(audiofile2, always_2d=True)

With this, data.shape will always have two elements; data.shape[0] will be the number of frames and data.shape[1] will be the number of channels.
