
How to use Google Cloud Speech (V1 API) for speech to text: need to process audio files over 3 hours long properly and efficiently

I have been looking through the documentation but have not found a solution yet.

I have installed the NuGet package.

I have also generated an API key.

However, I can't find proper documentation on how to use the API key.

Moreover, I want to be able to upload very long audio files.

So what would be the proper way to upload audio files up to 3 hours long and get their results?

I have a $300 budget, so that should be enough.

Here is my code so far.

This code currently fails because I have not set the credentials correctly, and I don't know how to.

I also have a service account file ready to use.

using System.Diagnostics;
using System.Windows;
using Google.Cloud.Speech.V1;

public partial class MainWindow : Window
{
    public MainWindow()
    {
        InitializeComponent();
    }

    private void Button_Click(object sender, RoutedEventArgs e)
    {
        var speech = SpeechClient.Create();           
        
        var config = new RecognitionConfig
        {               
            Encoding = RecognitionConfig.Types.AudioEncoding.Flac,
            SampleRateHertz = 48000,
            LanguageCode = LanguageCodes.English.UnitedStates
        };
        // "1m.flac" is a local path, so read it with FromFile; FromStorageUri expects a gs:// URI.
        var audio = RecognitionAudio.FromFile("1m.flac");

        var response = speech.Recognize(config, audio);

        foreach (var result in response.Results)
        {
            foreach (var alternative in result.Alternatives)
            {
                Debug.WriteLine(alternative.Transcript);
            }
        }
    }
}


I don't want to set an environment variable. I have both an API key and a service account JSON file. How can I set the credentials manually?

If you don't want to use the environment variable, you need to use SpeechClientBuilder to create a SpeechClient with custom credentials. Assuming you've got a service account file somewhere, change this:

var speech = SpeechClient.Create();

to this:

var speech = new SpeechClientBuilder
{
    CredentialsPath = "/path/to/your/file"
}.Build();

Note that to perform a long-running recognition operation, you should also use the LongRunningRecognize method. I strongly suspect your current RPC will fail, either explicitly because it's trying to run on a file that's too large, or because it just times out.
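As a rough sketch of how the two suggestions could fit together, something like the following might work. It assumes the audio has already been uploaded to a Cloud Storage bucket (as far as I know, long audio has to be referenced by a gs:// URI rather than sent inline). The bucket and object names, the class name and the method name are all hypothetical placeholders, and the credentials path is the same placeholder as above.

```csharp
using System;
using Google.Cloud.Speech.V1;

public static class LongAudioTranscriber
{
    public static void Transcribe()
    {
        // Placeholder path: point this at your service account JSON file.
        var speech = new SpeechClientBuilder
        {
            CredentialsPath = "/path/to/your/file"
        }.Build();

        var config = new RecognitionConfig
        {
            Encoding = RecognitionConfig.Types.AudioEncoding.Flac,
            SampleRateHertz = 48000,
            LanguageCode = LanguageCodes.English.UnitedStates
        };

        // Hypothetical bucket and object name; replace with your own upload.
        var audio = RecognitionAudio.FromStorageUri("gs://your-bucket/long-audio.flac");

        // Start the server-side operation, then block until it has finished.
        var operation = speech.LongRunningRecognize(config, audio);
        var completed = operation.PollUntilCompleted();

        foreach (var result in completed.Result.Results)
        {
            foreach (var alternative in result.Alternatives)
            {
                Console.WriteLine(alternative.Transcript);
            }
        }
    }
}
```

PollUntilCompleted blocks the calling thread, so in a WPF click handler you would probably prefer the async variants (LongRunningRecognizeAsync and PollUntilCompletedAsync) to keep the UI responsive.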

You need to set the environment variable before you create the SpeechClient instance:

 Environment.SetEnvironmentVariable("GOOGLE_APPLICATION_CREDENTIALS", "text-tospeech.json");

Here the second parameter (text-tospeech.json) is the path to your credentials file, generated by the Google API console.


 