Convolving Room Impulse Response with a Wav File (python)

I have written the following code, which is supposed to add echo to an existing sound file. Unfortunately, the output is very noisy, and I don't understand why. Can anybody help me with this? Have I skipped a step?

# Convolve a room impulse response with a sound sample (both stereo)
import sys
import numpy as np
from scipy.io import wavfile
from scipy.signal import fftconvolve

# sound_path, sound_file_name, IR_path, IR_file_name and in_counter
# come from the enclosing loop (not shown)
inp = wavfile.read(sound_path + sound_file_name)   # (rate, samples)
IR = wavfile.read(IR_path + IR_file_name)
if inp[0] != IR[0]:
    print("Sample rate mismatch")
    sys.exit(-1)
else:
    rate = inp[0]
print(sound_file_name)
out_0 = fftconvolve(inp[1][:, 1], IR[1][:, 0])
out_1 = fftconvolve(inp[1][:, 1], IR[1][:, 1])
in_counter += 1
out = np.vstack((out_0, out_1)).T
out[:inp[1].shape[0]] = out[:inp[1].shape[0]] + inp[1]
wavfile.write(sound_path + sound_file_name + '_echoed.wav', rate, out)

Adding echo to a sound file is just that... adding echo. Your code doesn't look like it's adding two sounds together; it looks like it's transforming the input sound into something else.

Your data flow should look something like this:

source sound ------------------------------>|
      |                                     + ----------> target sound
      ---------> convolution echo --------->|

Note that your echo sound is going to be longer than your original sound (i.e., it has a "tail").

Adding two sounds together is simply a matter of adding each of the individual samples together from both sounds to produce a new output wave. I don't think vstack does that.
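The data flow above can be sketched roughly as follows. This is a minimal example on a mono float signal, not the asker's actual pipeline; the function name `add_echo` and the `wet_gain` parameter are hypothetical names chosen for illustration.

```python
import numpy as np
from scipy.signal import fftconvolve

def add_echo(dry, ir, wet_gain=0.5):
    """Mix a mono float signal with its convolution against an impulse response.

    `add_echo` and `wet_gain` are illustrative names, not part of any library.
    """
    wet = fftconvolve(dry, ir)      # length is len(dry) + len(ir) - 1 (includes the tail)
    out = wet_gain * wet            # start from the scaled echo
    out[:len(dry)] += dry           # add the dry signal sample by sample
    return out

dry = np.array([1.0, 0.0, 0.0, 0.0])
ir = np.array([0.0, 0.0, 1.0])      # a single reflection delayed by two samples
mixed = add_echo(dry, ir)
```

The output is two samples longer than the input here, which is the "tail" mentioned above; the dry signal only overlaps the first `len(dry)` samples.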

Apparently WAV files are read as int16 arrays, and any modification should be done after converting them to floats: http://nbviewer.ipython.org/github/mgeier/python-audio/blob/master/audio-files/audio-files-with-pysoundfile.ipynb

After the convolution, the result needs to be renormalized. And that's it.

Hope this helps others too.

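import sys
import numpy as np
from scipy.io import wavfile
from scipy.signal import fftconvolve
from utility import pcm2float, float2pcm

# Read both files and convert the int16 samples to float32
input_rate, input_sig = wavfile.read(sound_path + sound_file_name)
input_sig = pcm2float(input_sig, 'float32')
IR_rate, IR_sig = wavfile.read(IR_path + IR_file_name)
IR_sig = pcm2float(IR_sig, 'float32')

if input_rate != IR_rate:
    print("Sample rate mismatch")
    sys.exit(-1)
else:
    rate = input_rate
print(sound_file_name)
con_len = -1  # slice end; -1 leaves out the last sample of each signal
# Convolve channel by channel, then normalize each channel to a peak of 1
out_0 = fftconvolve(input_sig[:con_len, 0], IR_sig[:con_len, 0])
out_0 = out_0 / np.max(np.abs(out_0))
out_1 = fftconvolve(input_sig[:con_len, 1], IR_sig[:con_len, 1])
out_1 = out_1 / np.max(np.abs(out_1))  # was out_0 / ..., which overwrote channel 1 with channel 0
in_counter += 1
out = np.vstack((out_0, out_1)).T
wavfile.write(sound_path + sound_file_name + '_' + IR_file_name + '_echoed.wav', rate, float2pcm(out, 'int16'))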

The utility module (which provides pcm2float and float2pcm) can be downloaded from the link above.
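If downloading the utility module is inconvenient, the same idea can be sketched with NumPy alone. This is only a sketch assuming 16-bit PCM input and stereo arrays of shape (samples, 2); the helper names `int16_to_float`, `float_to_int16` and `convolve_stereo` are hypothetical and are not the functions from the linked notebook. Unlike the code above, it normalizes both channels by one shared peak, which is a design choice made here to keep the left/right balance.

```python
import numpy as np
from scipy.signal import fftconvolve

def int16_to_float(sig):
    """Scale int16 PCM samples to float32 values in [-1.0, 1.0)."""
    return sig.astype(np.float32) / 32768.0

def float_to_int16(sig):
    """Scale floats back to the int16 range, clipping anything outside it."""
    scaled = np.clip(sig * 32768.0, -32768, 32767)
    return scaled.astype(np.int16)

def convolve_stereo(sig, ir):
    """Convolve each channel with the matching IR channel, then normalize."""
    out = np.stack([fftconvolve(sig[:, ch], ir[:, ch]) for ch in range(2)], axis=1)
    peak = np.max(np.abs(out))
    return out / peak if peak > 0 else out   # one shared peak for both channels

# Tiny synthetic stand-ins for the arrays returned by wavfile.read
sig_pcm = np.array([[16384, -16384], [0, 0], [8192, 8192]], dtype=np.int16)
ir_pcm = np.array([[32767, 0], [0, 16384]], dtype=np.int16)

wet = convolve_stereo(int16_to_float(sig_pcm), int16_to_float(ir_pcm))
pcm_out = float_to_int16(wet)   # ready to pass to wavfile.write
```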

UPDATE: Although this generates a working output, it is still not as good as the result of doing the convolution on the original website, Openair.
