How to perform real-time speech recognition using Google Speech Recognition

Question

I have a question about a problem I'm facing in my project. It is supposed to communicate with the user via speech. I'm using the Google Speech API to give commands to the system. It takes some time to process the command and then responds. The problem is that it pauses longer than expected (6-8 seconds) before it answers.

For my program, I need real-time speech recognition so that the system responds as soon as I finish the question. Is there any way to send each word to the API as it is spoken, rather than sending the whole sentence after it is completed? My code is below:

import speech_recognition as sr

# obtain audio from the microphone
r = sr.Recognizer()
with sr.Microphone() as source:
    print("Say something!")
    audio = r.listen(source)

try:
    print("You said " + r.recognize_google(audio))
except sr.UnknownValueError:
    print("Ooops! Could not understand audio")
except sr.RequestError as e:
    print("Could not request results from Google Speech Recognition service; {0}".format(e))

I'm a student and doing an academic project. Any help is highly appreciated. Thank you very much.

Answer 1

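You could use streaming recognition and set the interimResults parameter to true, so that tentative transcripts are returned while you are still speaking instead of only after the whole recording has been sent. See https://cloud.google.com/speech-to-text/docs/basics .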
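As a rough illustration of that idea, here is a minimal sketch rather than a definitive implementation. It assumes the google-cloud-speech Python package is installed and Cloud credentials are configured, because r.recognize_google in the question uploads one finished recording and does not return interim results. The helper names microphone_chunks, stream_transcripts and first_final are hypothetical, and the 16 kHz sample rate and "en-US" language code are assumptions. Clean shutdown of the microphone and the stream is left out.

```python
from typing import Callable, Iterable, Iterator, Tuple


def microphone_chunks(sample_rate: int = 16000, chunk_size: int = 1024) -> Iterator[bytes]:
    """Yield raw 16-bit PCM chunks from the default microphone, indefinitely."""
    import speech_recognition as sr  # same package as in the question (needs PyAudio)

    with sr.Microphone(sample_rate=sample_rate, chunk_size=chunk_size) as source:
        while True:
            yield source.stream.read(source.CHUNK)


def stream_transcripts(
    chunks: Iterable[bytes],
    sample_rate: int = 16000,       # assumption: must match the audio being sent
    language_code: str = "en-US",   # assumption
) -> Iterator[Tuple[str, bool]]:
    """Yield (transcript, is_final) pairs while audio is still being streamed."""
    from google.cloud import speech  # assumption: google-cloud-speech is installed

    client = speech.SpeechClient()
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=sample_rate,
        language_code=language_code,
    )
    # interim_results is the Python spelling of the interimResults parameter.
    streaming_config = speech.StreamingRecognitionConfig(
        config=config, interim_results=True
    )
    requests = (
        speech.StreamingRecognizeRequest(audio_content=chunk) for chunk in chunks
    )
    for response in client.streaming_recognize(config=streaming_config, requests=requests):
        for result in response.results:
            if result.alternatives:
                yield result.alternatives[0].transcript, result.is_final


def first_final(
    transcripts: Iterable[Tuple[str, bool]],
    on_interim: Callable[[str], None] = print,
) -> str:
    """Report interim text as it arrives and return the first final transcript."""
    for text, is_final in transcripts:
        if is_final:
            return text
        on_interim(text)
    return ""


def main() -> None:
    # Needs a microphone and valid Google Cloud credentials.
    command = first_final(stream_transcripts(microphone_chunks()))
    print("You said " + command)
```

Calling main() prints the tentative text while you speak and then the final command once the service marks a result as final. Because first_final only consumes (text, is_final) pairs, the response logic can be tried with a plain list before wiring up the microphone.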

If you're looking for an environment you can clone and get started with the Speech API, you can check the realtime-transcription-playground repository.

solution1
0 2021-07-08 07:05:33
