Raspberry Pi: convert PyAudio WAV to FLAC at 48000 Hz + Google Speech

I am facing the following problem:

I recorded sound with PyAudio and saved it as a WAV file. The WAV file is 48000 Hz (no other rate works; I get a sampling rate error, but that's another story). The WAV file sounds good. Now I want to convert the WAV to FLAC to send it to the Google Speech API.

The problem is that avconv converts my 48 kHz input WAV to an 8 kHz FLAC (even with -ar 48000). The FLAC file is just white noise. I have tried a great deal, but even Google has no answer ;)

Note: it worked fine for me with another microphone at 16 kHz, with no problems at all: neither the PyAudio sampling rate error nor the avconv problem.

Here is the code:

Recording:

import audioop
import os
import time
import wave
from collections import deque

import pyaudio

chunk = 2048
FORMAT = pyaudio.paInt16
CHANNELS = 1
RATE = 48000
THRESHOLD = 525 # Intensity threshold; anything lower counts as silence.
SILENCE_LIMIT = 3 # Silence limit in seconds. Once only silence has been recorded for this long, the recording finishes and the file is delivered.

#open stream
p = pyaudio.PyAudio()

stream = p.open(format = FORMAT,
                channels = CHANNELS,
                rate = RATE,
                input = True,
                frames_per_buffer = chunk)

print "* listening. CTRL+C to finish manually."
all_m = []
data = ''
rel = RATE/chunk
slid_win = deque(maxlen=SILENCE_LIMIT*rel)
started = False

while (True):
    data = stream.read(chunk)
    slid_win.append (abs(audioop.avg(data, 2)))

    if(True in [ x>THRESHOLD for x in slid_win]):
        if(not started):
            print "starting record"
        started = True
        all_m.append(data)
    elif (started==True):
        print "finished"
        #the limit was reached, finish capture and deliver
        filename = save_speech(all_m,p)
        result=stt_google_wav(filename)
        #reset all
        started = False
        #slid_win = deque(maxlen=SILENCE_LIMIT*rel)
        #all_m= []
        print "Google STT Done"
        stream.close()
        p.terminate()
        return result

And:

def save_speech(data, p):
    filename = 'output_'+str(int(time.time()))
    # write data to WAVE file
    data = ''.join(data)
    wf = wave.open(filename+'.wav', 'wb')
    wf.setnchannels(1)
    wf.setsampwidth(p.get_sample_size(pyaudio.paInt16))
    wf.setframerate(48000)
    wf.writeframes(data)
    wf.close()
    print "finished saving wav: %s" % filename
    return filename
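The silence detection in the recording loop can be tried without a microphone. The following is only a sketch under a few assumptions: it uses Python 3 syntax (the code above is Python 2), the helper names chunk_level, window_size and segment are hypothetical, and the level is an RMS computed with struct rather than audioop.avg, because audioop is not available in recent Python versions. That last point is a substitution, not the method used above.

```python
import math
import struct
from collections import deque

RATE = 48000
CHUNK = 2048
THRESHOLD = 525
SILENCE_LIMIT = 3  # seconds


def chunk_level(data):
    """RMS of 16-bit little-endian mono samples (stand-in for audioop)."""
    count = len(data) // 2
    if count == 0:
        return 0
    samples = struct.unpack("<%dh" % count, data[:count * 2])
    return int(math.sqrt(sum(s * s for s in samples) / count))


def window_size(rate=RATE, chunk=CHUNK, seconds=SILENCE_LIMIT):
    """Number of chunks that cover `seconds` of audio (integer division)."""
    return seconds * (rate // chunk)


def segment(chunks, threshold=THRESHOLD, maxlen=None):
    """Keep chunks from the first loud one until the whole window is quiet."""
    window = deque(maxlen=maxlen or window_size())
    recorded = []
    started = False
    for data in chunks:
        window.append(chunk_level(data))
        if any(level > threshold for level in window):
            started = True
            recorded.append(data)
        elif started:
            # Every level still in the window is below the threshold.
            break
    return recorded
```

With RATE = 48000 and chunk = 2048, the window holds 3 * (48000 // 2048) = 69 chunks, so recording stops after roughly three seconds in which no chunk exceeds the threshold.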

To convert to FLAC:

os.system("avconv -i "+ filename+".wav  -y -ar 48000 "+ filename+ ".flac")
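For comparison, the same conversion can be run through subprocess with an argument list, which avoids shell quoting problems and raises an error if avconv exits with a non-zero status. This is a sketch in Python 3 syntax; the function names build_avconv_cmd and convert_to_flac are hypothetical, and the flags are exactly the ones from the command above.

```python
import subprocess


def build_avconv_cmd(basename, rate=48000):
    """Build the avconv argument list for basename.wav -> basename.flac."""
    return [
        "avconv",
        "-i", basename + ".wav",
        "-y",                  # overwrite the output file if it exists
        "-ar", str(rate),      # output sample rate in Hz
        basename + ".flac",
    ]


def convert_to_flac(basename, rate=48000):
    """Run avconv and return the FLAC file name; raises on failure."""
    subprocess.check_call(build_avconv_cmd(basename, rate))
    return basename + ".flac"
```

Calling convert_to_flac(filename) with the name returned by save_speech would then stand in for the os.system line, assuming avconv is on the PATH.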

EDIT 1:

The FLAC is actually 48 kHz. I don't know why mplayer reports it as 8 kHz; I played it on my PC and the FLAC is perfect. Still, the Google API seems to have problems with it, because it returns nothing. I assume that the white noise problem with mplayer on the Raspberry Pi is connected to the problem with the Google API, but I have no idea what it could be.

WAV file:

output_1385413929.wav: RIFF (little-endian) data, WAVE audio, Microsoft PCM, 16 bit, mono 48000 Hz

FLAC file:

output_1385413929.flac: FLAC audio bitstream data, 16 bit, mono, 48 kHz, 204800 samples
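The WAV header can also be read from Python with the standard wave module, independently of what a player reports. A minimal sketch in Python 3 syntax; the function name wav_params and the dictionary keys are hypothetical. Note that the wave module reads WAV files only, not FLAC.

```python
import wave


def wav_params(path):
    """Return the basic header fields of a PCM WAV file."""
    with wave.open(path, "rb") as wf:
        frames = wf.getnframes()
        rate = wf.getframerate()
        return {
            "channels": wf.getnchannels(),
            "sample_width_bytes": wf.getsampwidth(),
            "rate_hz": rate,
            "frames": frames,
            "seconds": frames / rate,
        }
```

If the WAV reports the same frame count as the FLAC (204800 samples above, about 4.27 seconds at 48 kHz), the conversion kept both the sample rate and the length.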

Solved: I don't know why, but I turned on my Pi to test some more and suddenly it worked without my changing anything.

Thanks for your help. Greetings from Germany, Flo

I agree - works down the line for me:

me@raspberrypi /mnt/share/Audio/xxxxxx $ file sample_audio.wav 
sample_audio.wav: RIFF (little-endian) data, WAVE audio, Microsoft PCM, 16 bit, stereo 8000 Hz
me@raspberrypi /mnt/share/Audio/xxxxxx $ file sample_audio.flac 
sample_audio.flac: FLAC audio bitstream data, 16 bit, stereo, 48 kHz, 9131406 samples
