
librosa can't open .wav created by librosa?

I'm trying to use librosa to generate some data by cutting 1 s pieces from some .wav files with a duration of 60 s.

This part works: all the files are created and I can listen to them in any player. But if I try to open them with librosa.load, I get this error:

>>> librosa.load('.\\train\\audio\\silence\\0doing_the_dishes.wav', sr=None)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\gionata\AppData\Local\Programs\Python\Python36\lib\site-packages\librosa\core\audio.py", line 107, in load
    with audioread.audio_open(os.path.realpath(path)) as input_file:
  File "C:\Users\gionata\AppData\Local\Programs\Python\Python36\lib\site-packages\audioread\__init__.py", line 116, in audio_open
    raise NoBackendError()
audioread.NoBackendError

Do you have any suggestions? I create the .wav files with this function:

def create_silence():
    path = DB + "_background_noise_/"
    sounds = [x[len(DB):] for x in glob.glob(path + '*.wav')]
    for elem in sounds:
        sound = elem.split('\\')[1]
        print(sound)
        # Cut a 1 s window every 0.3 s
        for j, i in enumerate(np.arange(0.0, 59.0, 0.3)):
            y, sr = librosa.load(DB + elem, sr=None, offset=i, duration=1.0)
            librosa.output.write_wav(DB + 'silence/' + str(j) + sound, y, sr=sr, norm=False)

The problem only occurs with files created by librosa; librosa.load has worked with other files with no problems at all.

This is related to ffmpeg. If you are on Windows, you can solve it by following the instructions linked here; if you are on Linux, you can try:

sudo apt-get install libav-tools

import librosa
audio_path = 'C:/Users/hp/name.wav'  # location of the file
(xf, sr) = librosa.load(audio_path)

This worked for me. xf is the array of audio samples and sr is the sampling frequency.

I solved this. Librosa writes the values as they are; in my case the np.array was float32, but the standard is 16 bits per value, so changing the type does the trick:

def create_silence():
    path = DB + "_background_noise_/"
    maxv = np.iinfo(np.int16).max
    sounds = [x[len(DB):] for x in glob.glob(path + '*.wav')]
    for elem in sounds:
        sound = elem.split('\\')[1]
        print(sound)
        for j, i in enumerate(np.arange(0.0, 59.0, 0.3)):
            y, fs = librosa.load(DB + elem, sr=None, offset=i, duration=1.0, mono=False)
            librosa.output.write_wav(DB + 'silence/' + str(j) + sound, y=(y * maxv).astype(np.int16), sr=fs, norm=False)
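
The type change above can be looked at in isolation. The following is only a minimal sketch of the float-to-int16 scaling, assuming the samples are floats nominally in [-1.0, 1.0] (which is what librosa.load returns). The helper name float_to_int16 and the clipping step are my own additions, not part of the answer; clipping just keeps out-of-range values from wrapping around when cast.

```python
import numpy as np

def float_to_int16(y):
    """Scale float samples in [-1.0, 1.0] to 16-bit integers (hypothetical helper)."""
    maxv = np.iinfo(np.int16).max  # largest value an int16 can hold
    # Clip first so values outside [-1, 1] do not overflow when cast
    clipped = np.clip(np.asarray(y, dtype=np.float64), -1.0, 1.0)
    # astype truncates toward zero, the same as the cast used in the answer
    return (clipped * maxv).astype(np.int16)

# Small made-up input, including one value that is out of range
samples = np.array([0.0, 0.5, -1.0, 1.5], dtype=np.float32)
pcm = float_to_int16(samples)
print(pcm.dtype, pcm.tolist())
```

The resulting array is what gets passed as y to the writer in the function above.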

I couldn't get 吴连伟's or Gionata's solution to work, but this did the trick:

import numpy as np
from scipy.io import wavfile

maxv = np.iinfo(np.int16).max
wavfile.write(path, sr, (y * maxv).astype(np.int16))

(Where path is the path and file name, y is the first output from librosa.load, and sr is the second output from librosa.load.)

I could load this .wav file with librosa at a later stage, so it solved the problem!
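
To see which sample format a given file actually uses, one option is to read the format tag from the file's 'fmt ' chunk. This is a rough sketch using only the standard library; the function name wav_format_tag is made up for illustration. In the WAV format, tag 1 means integer PCM and tag 3 means IEEE float, so a float32 file like the ones described in this thread would be expected to report 3. The demo writes a small 16-bit file with the stdlib wave module rather than with librosa or scipy.

```python
import os
import struct
import tempfile
import wave

def wav_format_tag(path):
    """Return (format_tag, bits_per_sample) from a WAV file's 'fmt ' chunk."""
    with open(path, "rb") as f:
        riff, _, wave_id = struct.unpack("<4sI4s", f.read(12))
        if riff != b"RIFF" or wave_id != b"WAVE":
            raise ValueError("not a RIFF/WAVE file")
        while True:
            header = f.read(8)
            if len(header) < 8:
                raise ValueError("no 'fmt ' chunk found")
            chunk_id, size = struct.unpack("<4sI", header)
            if chunk_id == b"fmt ":
                fmt = f.read(size)
                tag = struct.unpack("<H", fmt[0:2])[0]     # 1 = PCM, 3 = IEEE float
                bits = struct.unpack("<H", fmt[14:16])[0]  # bits per sample
                return tag, bits
            # Skip other chunks; chunks are padded to an even number of bytes
            f.seek(size + (size & 1), 1)

# Demo: write one second of 16-bit silence and inspect it
demo_path = os.path.join(tempfile.mkdtemp(), "demo.wav")
with wave.open(demo_path, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)      # 2 bytes = 16 bits per sample
    w.setframerate(16000)
    w.writeframes(b"\x00\x00" * 16000)

tag, bits = wav_format_tag(demo_path)
print(tag, bits)
```

Running the same function on a file before and after the int16 conversion shows whether the conversion actually changed what was written.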

libav-tools is deprecated in Ubuntu, so

sudo apt-get install ffmpeg 

did the trick.
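
After installing, a quick way to check that the ffmpeg executable is visible from Python is to look it up on PATH. This is only a sketch, and it assumes the backend in question calls the ffmpeg command-line program; it does not prove that audioread will pick it up.

```python
import shutil

# shutil.which returns the full path to the executable, or None if it is not on PATH
ffmpeg_path = shutil.which("ffmpeg")

if ffmpeg_path is None:
    print("ffmpeg was not found on PATH")
else:
    print("ffmpeg found at", ffmpeg_path)
```

If this prints that ffmpeg was not found, open a new terminal after the install (or check PATH on Windows) before retrying librosa.load.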


 