
How to perform real-time speech recognition using Google Speech Recognition

I have a question about a problem I'm facing in my project. It is supposed to communicate with the user via speech. I'm using the Google Speech API to give commands to the system. It takes some time to process the command and then responds. The problem is that it pauses longer than expected (6-8 seconds) before it answers.

For my program, I need real-time speech recognition so that the system responds as soon as I finish the question. Is there any way to send each word to the API as it is spoken, rather than sending the whole sentence after it is completed? My code is below:

import speech_recognition as sr

# obtain audio from the microphone
r = sr.Recognizer()
with sr.Microphone() as source:
    print("Say something!")
    # listen() blocks until it detects the end of a phrase, then returns the whole recording
    audio = r.listen(source)

# send the complete recording to Google Speech Recognition in a single request
try:
    print("You said " + r.recognize_google(audio))
except sr.UnknownValueError:
    print("Ooops! Could not understand audio")
except sr.RequestError as e:
    print("Could not request results from Google Speech Recognition service; {0}".format(e))

I'm a student working on an academic project. Any help is highly appreciated. Thank you very much.

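You could set the interimResults parameter to true (interim_results=True in the Python client) so that partial transcripts are returned while you are still speaking: https://cloud.google.com/speech-to-text/docs/basics. This option applies to streaming recognition requests in the Cloud Speech-to-Text API, so the audio has to be streamed to the service in chunks rather than sent as one finished recording.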
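As a rough illustration, here is a minimal sketch of a streaming request with interim results, assuming the google-cloud-speech Python package is installed and credentials are configured. The function names (iter_transcripts, stream_transcripts, fake_response), the 16 kHz LINEAR16 format, and the "en-US" language code are my own assumptions for the example, not something from the question. Capturing microphone audio into byte chunks is left out; audio_chunks can be any iterable of raw audio bytes.

```python
from types import SimpleNamespace  # only needed for the offline stand-in below


def iter_transcripts(responses):
    """Yield (transcript, is_final) for each streaming response that carries text."""
    for response in responses:
        if not response.results:
            continue
        result = response.results[0]
        if not result.alternatives:
            continue
        # Interim results have is_final == False and may still change.
        yield result.alternatives[0].transcript, result.is_final


def stream_transcripts(audio_chunks, sample_rate_hertz=16000, language_code="en-US"):
    """Stream raw LINEAR16 chunks to Cloud Speech-to-Text and yield transcripts.

    Hypothetical helper: the sample rate and language are assumptions, adjust
    them to match your audio source.
    """
    from google.cloud import speech  # imported lazily; requires google-cloud-speech

    client = speech.SpeechClient()
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=sample_rate_hertz,
        language_code=language_code,
    )
    streaming_config = speech.StreamingRecognitionConfig(
        config=config,
        interim_results=True,  # return partial hypotheses while audio is still arriving
    )
    requests = (
        speech.StreamingRecognizeRequest(audio_content=chunk)
        for chunk in audio_chunks
    )
    responses = client.streaming_recognize(config=streaming_config, requests=requests)
    yield from iter_transcripts(responses)


def fake_response(text, is_final):
    """Offline stand-in shaped like a streaming response, for trying the parsing logic."""
    alternative = SimpleNamespace(transcript=text)
    result = SimpleNamespace(alternatives=[alternative], is_final=is_final)
    return SimpleNamespace(results=[result])
```

In a real program you would loop over stream_transcripts(chunks), show or act on the interim text right away, and treat an item with is_final set to True as the completed utterance.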

If you're looking for an environment you can clone and get started with the Speech API, you can check the realtime-transcription-playground repository.
