
Google Speech API v2 result is blank

I want to use the Google Speech API in my current project.

I got my information about how to access the API from here.

As described on GitHub, you have to send a POST web request to the server and get the result back as JSON.

I also got some source code used for the v1 API from here.

Setting up the request is not that hard:

WebRequest request = WebRequest.Create(Constants.GoogleRequestString);
request.Method = "POST";
request.ContentType = "audio/x-flac; rate=" + sampleRate;
request.ContentLength = bytes.Length;

In my example, Constants.GoogleRequestString equals https://www.google.com/speech-api/v2/recognize?output=json&lang=en-us&key=AIzaSyCnl6MRydhw_5fLXIdASxkLJzcJh5iX0M4

I downloaded the .flac files from the GitHub link and wrote a small C# program that loads the bytes of the FLAC file and sends them to the server with the slightly modified method GoogleRequest(byte[] bytes, int sampleRate).

I open the stream as shown in the method and send all bytes to the server. I get a response, but the JSON string I get is: "{\"result\":[]}"

I have no idea why it is not working. Either the file or the spoken text in the file is not correct (but if I listen to it with VLC I clearly hear the spoken text), or my program still has some bugs.

Have you ever run into the Speech API returning no result? Shouldn't it say something like result: couldn't understand what is spoken, or give some other error message?

I just tried out the .wav file. This worked for me.

Your code is fine, assuming it resembles this:

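using System;
using System.Diagnostics;
using System.IO;
using System.Net;

var uriBuilder = new UriBuilder(
    "https",
    "www.google.com",
    443,
    "speech-api/v2/recognize",
    "?output=json&lang=en-us&key=YOURAPIKEY");
int sampleRate = 44100; // must match the sample rate of the FLAC file

using (var stream = File.Open("c:\\tmp\\g2.flac", FileMode.Open))
{
    HttpWebRequest request = (HttpWebRequest) WebRequest.Create(uriBuilder.Uri);
    request.Method = "POST";
    request.ContentType = "audio/x-flac; rate=" + sampleRate;
    request.AutomaticDecompression = DecompressionMethods.GZip;

    // Copy the audio into the request body and close the request stream afterwards.
    using (var requestStream = request.GetRequestStream())
    {
        stream.CopyTo(requestStream);
    }

    try
    {
        using (var resp = request.GetResponse().GetResponseStream())
        using (var sr = new StreamReader(resp))
        {
            Debug.WriteLine(sr.ReadToEnd());
        }
    }
    catch (WebException ee)
    {
        // Response is null when no HTTP response was received at all.
        if (ee.Response == null) throw;

        // On an HTTP error (such as 403 Forbidden) the body explains what went wrong.
        using (var sr = new StreamReader(ee.Response.GetResponseStream()))
        {
            Debug.WriteLine(sr.ReadToEnd());
        }
    }
}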

What is important, though, is the exact format of the FLAC file. I used Audacity to control how my audio track would be saved.

After recording I changed the track settings to:

  • Mono
  • Sample Format: 16-Bit PCM
  • Rate: 44100 Hz

The following screenshot shows those settings:

[Screenshot: Audacity settings]

With the default stereo track and 32-bit float sample format, I couldn't get the Speech API to produce any result other than the empty JSON payload you also got.
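One way to check a file before uploading it is to read these properties from the FLAC header. The sketch below is not part of the original answer. It assumes the file starts with the standard fLaC marker followed by a STREAMINFO metadata block, as the FLAC format specifies, and the names FlacInfo, FlacHeader and Read are hypothetical, chosen only for illustration.

```csharp
using System;

// Hypothetical container for the three properties discussed above.
public sealed class FlacInfo
{
    public int SampleRate;
    public int Channels;
    public int BitsPerSample;
}

public static class FlacHeader
{
    // Reads sample rate, channel count and bit depth from the STREAMINFO
    // block at the start of a FLAC file. Assumption: STREAMINFO is the
    // first metadata block, which the FLAC format requires.
    public static FlacInfo Read(byte[] data)
    {
        if (data == null || data.Length < 22)
            throw new ArgumentException("Too short to contain a STREAMINFO block.");
        if (data[0] != (byte)'f' || data[1] != (byte)'L' ||
            data[2] != (byte)'a' || data[3] != (byte)'C')
            throw new ArgumentException("Missing fLaC marker.");
        if ((data[4] & 0x7F) != 0)
            throw new ArgumentException("First metadata block is not STREAMINFO.");

        // STREAMINFO starts at offset 8. After 10 bytes of block and frame
        // sizes come 20 bits of sample rate, 3 bits of (channels - 1) and
        // 5 bits of (bits per sample - 1).
        return new FlacInfo
        {
            SampleRate = (data[18] << 12) | (data[19] << 4) | (data[20] >> 4),
            Channels = ((data[20] >> 1) & 0x07) + 1,
            BitsPerSample = (((data[20] & 0x01) << 4) | (data[21] >> 4)) + 1
        };
    }
}
```

Calling FlacHeader.Read(File.ReadAllBytes(path)) on a file saved with the settings above should report 1 channel, 16 bits per sample and 44100 Hz, and the reported sample rate can be used for the rate= part of the content type instead of a hard-coded value.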

With the above settings my result is:

{
    "result" : []
}{
    "result" : [{
            "alternative" : [{
                    "transcript" : "translate this",
                    "confidence" : 0.92849225
                }, {
                    "transcript" : "translate days"
                }, {
                    "transcript" : "translate dish"
                }, {
                    "transcript" : "translate fish"
                }, {
                    "transcript" : "translate these"
                }
            ],
            "final" : true
        }
    ],
    "result_index" : 0
}

My English pronunciation isn't very good, as Google thinks I might want to translate fish ...
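Note that the output above holds two JSON objects back to back, and the first one has an empty result. A minimal sketch of how such a response body could be split and searched for the first transcript follows. It is not part of the original answer: SpeechResponse, SplitTopLevelObjects and FirstTranscript are hypothetical names, and it assumes System.Text.Json is available (it ships with .NET Core 3.0 and later, and is a separate package on .NET Framework).

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public static class SpeechResponse
{
    // Splits a body that holds several JSON objects back to back
    // into one string per top-level object.
    public static List<string> SplitTopLevelObjects(string body)
    {
        var parts = new List<string>();
        int depth = 0, start = -1;
        bool inString = false, escaped = false;
        for (int i = 0; i < body.Length; i++)
        {
            char c = body[i];
            if (inString)
            {
                // Braces inside string values must not change the depth.
                if (escaped) escaped = false;
                else if (c == '\\') escaped = true;
                else if (c == '"') inString = false;
                continue;
            }
            if (c == '"') inString = true;
            else if (c == '{')
            {
                if (depth == 0) start = i;
                depth++;
            }
            else if (c == '}' && depth > 0)
            {
                depth--;
                if (depth == 0)
                    parts.Add(body.Substring(start, i - start + 1));
            }
        }
        return parts;
    }

    // Returns the first transcript found in any of the objects,
    // or null when every "result" array is empty.
    public static string FirstTranscript(string body)
    {
        foreach (string json in SplitTopLevelObjects(body))
        {
            using (JsonDocument doc = JsonDocument.Parse(json))
            {
                if (!doc.RootElement.TryGetProperty("result", out JsonElement results) ||
                    results.ValueKind != JsonValueKind.Array)
                    continue;
                foreach (JsonElement result in results.EnumerateArray())
                {
                    if (!result.TryGetProperty("alternative", out JsonElement alts) ||
                        alts.ValueKind != JsonValueKind.Array)
                        continue;
                    foreach (JsonElement alt in alts.EnumerateArray())
                    {
                        if (alt.TryGetProperty("transcript", out JsonElement t))
                            return t.GetString();
                    }
                }
            }
        }
        return null;
    }
}
```

With the response shown above, FirstTranscript would skip the empty first object and return the transcript of the first alternative in the second one. For a body that only contains an empty result, it returns null.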

If you get an HTTP error (like 403 Forbidden), the exception handler tries to read the full response from the HTTP body. If your authentication key is incorrect, it will tell you that.

To get your API keys to work with the Speech API, follow the instructions here.

Make sure you are a member of chromium-dev@chromium.org (you can just subscribe to chromium-dev and choose not to receive mail).

After that you can create a server key.


 