How to do real-time speech recognition with Google Speech Recognition

Question

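I have run into a problem with my project. It is supposed to communicate with the user by voice. I am using the Google Speech API to give commands to the system. It takes some time to process a command and then reply. The problem is that it needs a longer pause than expected (6-8 seconds) before it goes on to answer.

For my program I need real-time speech recognition, so that the system responds as soon as I finish my question. Is there any way to send each word to the API as it is spoken, rather than sending the whole sentence once it is finished? My code is below: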

import speech_recognition as sr

# obtain audio from the microphone
r = sr.Recognizer()
with sr.Microphone() as source:
    print("Say something!")
    audio = r.listen(source)

try:
    print("You said " + r.recognize_google(audio))
except sr.UnknownValueError:
    print("Ooops! Could not understand audio")
except sr.RequestError as e:
    print("Could not request results from Google Speech Recognition service; {0}".format(e))

I am a student working on an academic project. Any help is highly appreciated. Thank you very much.

Answer 1

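You can set the interimResults parameter to true (see https://cloud.google.com/speech-to-text/docs/basics). With streaming recognition, this makes the API return interim (partial) results while you are still speaking, instead of only a final result after the utterance ends.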
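As a rough illustration of what that could look like in Python, here is a minimal sketch. It assumes the google-cloud-speech client library (not the speech_recognition package used in the question), where the setting is spelled interim_results, and it assumes 16 kHz, 16-bit mono PCM audio. The helper names audio_chunks and stream_transcripts are made up for this example, and filling the queue from a microphone (for instance from a PyAudio callback) is left out.

```python
import queue


def audio_chunks(buffer):
    """Yield raw audio chunks from a queue until a None sentinel arrives.

    Hypothetical helper: a microphone callback is assumed to put
    small byte chunks into `buffer` and put None when recording stops.
    """
    while True:
        chunk = buffer.get()
        if chunk is None:
            return
        yield chunk


def stream_transcripts(chunks, language_code="en-US", sample_rate=16000):
    """Yield (transcript, is_final) pairs as the service returns them.

    Assumes the google-cloud-speech package is installed and that
    Google Cloud credentials are configured in the environment.
    """
    # Imported lazily so the chunk helper above works without the library.
    from google.cloud import speech

    client = speech.SpeechClient()
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=sample_rate,  # assumption: 16 kHz mono PCM
        language_code=language_code,
    )
    streaming_config = speech.StreamingRecognitionConfig(
        config=config,
        interim_results=True,  # ask for partial results while speaking
    )
    requests = (
        speech.StreamingRecognizeRequest(audio_content=chunk)
        for chunk in chunks
    )
    responses = client.streaming_recognize(
        config=streaming_config, requests=requests
    )
    for response in responses:
        for result in response.results:
            if result.alternatives:
                yield result.alternatives[0].transcript, result.is_final
```

A caller would iterate over stream_transcripts(audio_chunks(q)), show each partial transcript as it arrives, and act on a command once is_final is true. Sending audio in small chunks is what lets the service start recognizing before the sentence is finished.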

If you are looking for an environment you can clone to start using the Speech API, take a look at the realtime-transcription-playground repository.

Answer 1: 2021-07-08 07:05:33
