
How to detect the spoken language in the Google Cloud Platform Machine Learning Speech API

Original 2017-05-23 08:07:06 0 2 machine-learning/ google-cloud-platform/ speech-to-text

Is there an option to automatically detect the spoken language using Google Cloud Platform Machine Learning's Speech API?

https://cloud.google.com/speech/docs/languages lists the supported languages, and the user needs to set this parameter manually to perform speech-to-text.

Thanks, Mahesh

2 answers

As of last month, Google added support for detecting the spoken language to its speech-to-text API: Google Cloud Speech v1p1beta1

It's a bit limited, though: you have to provide a list of probable language codes (up to 3 only), and it's said to be supported only for voice command and voice search modes. It's useful if you have a clue about which other languages may be in your audio.

From their docs:

alternative_language_codes[]: string

Optional. A list of up to 3 additional BCP-47 language tags, listing possible alternative languages of the supplied audio. See Language Support for a list of the currently supported language codes. If alternative languages are listed, recognition result will contain recognition in the most likely language detected including the main language_code. The recognition result will include the language tag of the language detected in the audio. NOTE: This feature is only supported for Voice Command and Voice Search use cases and performance may vary for other use cases (e.g., phone call transcription).
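To make the quoted field concrete, here is a minimal sketch of a request body for the v1p1beta1 REST endpoint. It only builds and prints the JSON; it does not send anything. The bucket path, audio format, sample rate, and the helper names `build_recognize_request` and `detected_languages` are assumptions made for illustration, and the camelCase field names follow the REST JSON form of the snake_case fields quoted above.

```python
import json

# Assumption: REST endpoint of the v1p1beta1 API version mentioned above.
RECOGNIZE_URL = "https://speech.googleapis.com/v1p1beta1/speech:recognize"
MAX_ALTERNATIVES = 3  # limit stated in the quoted documentation


def build_recognize_request(audio_uri, language_code, alternative_language_codes=()):
    """Return a JSON-serializable body for a speech:recognize call (hypothetical helper)."""
    alternatives = list(alternative_language_codes)
    if len(alternatives) > MAX_ALTERNATIVES:
        raise ValueError("at most 3 alternative language codes are allowed")
    config = {
        "encoding": "LINEAR16",     # hypothetical audio format
        "sampleRateHertz": 16000,   # hypothetical sample rate
        "languageCode": language_code,  # main language, always required
    }
    if alternatives:
        config["alternativeLanguageCodes"] = alternatives
    return {"config": config, "audio": {"uri": audio_uri}}


def detected_languages(response):
    """Collect the languageCode reported on each result of a parsed JSON response."""
    return [result.get("languageCode") for result in response.get("results", [])]


body = build_recognize_request(
    "gs://example-bucket/command.wav",  # hypothetical Cloud Storage path
    "en-US",
    ["fr-FR", "de-DE"],
)
print(json.dumps(body, indent=2))
```

The body would be POSTed to `RECOGNIZE_URL` with valid credentials; `detected_languages` then reads the language tag that, per the quoted docs, each recognition result carries.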

Requests to the Google Cloud Speech API require the following configuration parameters: encoding, sampleRateHertz and languageCode. https://cloud.google.com/speech/reference/rest/v1/RecognitionConfig

Thus, it is not possible for the Google Cloud Speech API service to automatically detect the language used. The service is configured by this parameter (languageCode) to start recognizing speech in that specific language.
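As a rough illustration of this answer's point, the sketch below checks a v1-style config dictionary for the three parameters named above before a request would be sent. The helper name `missing_config_fields` and the sample values are assumptions, not part of the API.

```python
# The three RecognitionConfig parameters this answer describes as required.
REQUIRED_FIELDS = ("encoding", "sampleRateHertz", "languageCode")


def missing_config_fields(config):
    """Return the names of required parameters absent from config (hypothetical helper)."""
    return [name for name in REQUIRED_FIELDS if name not in config]


# A config that omits languageCode, i.e. one that hopes for auto-detection.
config = {"encoding": "FLAC", "sampleRateHertz": 16000}
print(missing_config_fields(config))  # ['languageCode']
```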

If you had in mind a parallel with the Google Cloud Translation API, where the input language is detected automatically, please consider that automatically detecting the language used in an audio file requires much more bandwidth, storage space and processing power than doing so for a text file. Also, the Google Cloud Speech API offers Streaming Speech Recognition, a real-time speech-to-text service, where the languageCode parameter is especially required.