
Google Speech API returns an empty result for some FLAC files, but not for others (C#)

An EMPTY result looks like this:

json[0] "{\"result\":[]}"
json[1] ""

A NON-EMPTY result (the desired result) looks like this:

json[0] "{\"result\":[]}"
json[1] "{\"result\":[{\"alternative\":[{\"transcript\":\"good morning Google how are you feeling today\",\"confidence\":0.987629}],\"final\":true}],\"result_index\":0}"
json[2] ""

I have this function that is supposed to take a ".flac" file and turn it into text. For some reason, only two sample ".flac" files return a string when passed through the Google Speech API; other FLAC files return an EMPTY result. These guys are having the same problem: link

Here are all my FLAC files: link

my.flac and this_is_a_test.flac work perfectly: the Google Speech API gives me a JSON object with the text in it.

However, recorded.flac does NOT work with the Google Speech API and gives me an EMPTY JSON object.

DEBUGGING:

  1. I thought the microphone was the problem, so I recorded recorded.flac many times, loud and clear, converting each recording to FLAC with ffmpeg. The Google Speech API still can't recognize recorded.flac.
  2. I thought I had the content type wrong in the code, so I tried

    _HWR_SpeechToText.ContentType = "audio/116; rate=16000";

    instead of

    _HWR_SpeechToText.ContentType = "audio/x-flac; rate=44100";

    With that change none of the files worked, not a single FLAC file, so I changed it back.

Here is my Google Speech API code that turns FLAC files into text (I don't think it is necessary, but here it is anyway):

public void convert_to_text()
{
    // Read the whole FLAC file into memory (swap in "my.flac" to test a working file).
    byte[] BA_AudioFile = File.ReadAllBytes("recorded.flac");

    HttpWebRequest _HWR_SpeechToText = (HttpWebRequest)WebRequest.Create(
        "https://www.google.com/speech-api/v2/recognize?output=json&lang=en-us&key=" + ACCESS_GOOGLE_SPEECH_KEY);
    _HWR_SpeechToText.Credentials = CredentialCache.DefaultCredentials;
    _HWR_SpeechToText.Method = "POST";
    _HWR_SpeechToText.ContentType = "audio/x-flac; rate=44100";
    _HWR_SpeechToText.ContentLength = BA_AudioFile.Length;

    // Send the audio bytes as the request body.
    using (Stream stream = _HWR_SpeechToText.GetRequestStream())
    {
        stream.Write(BA_AudioFile, 0, BA_AudioFile.Length);
    }

    string responseFromServer;
    using (HttpWebResponse HWR_Response = (HttpWebResponse)_HWR_SpeechToText.GetResponse())
    using (StreamReader SR_Response = new StreamReader(HWR_Response.GetResponseStream()))
    {
        responseFromServer = SR_Response.ReadToEnd();
    }

    // The response holds one JSON object per line; skip blank lines and empty results.
    foreach (String j in responseFromServer.Split('\n'))
    {
        dynamic jsonObject = JsonConvert.DeserializeObject(j);
        if (jsonObject == null || jsonObject.result.Count <= 0)
        {
            continue;
        }
        text = jsonObject.result[0].alternative[0].transcript;
    }
    label1.Content = text;
}

First, check that the file is 16-bit PCM mono, not stereo. This is easy to do with Audacity: http://www.audacityteam.org/
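The same check can also be done in code. The following is a minimal sketch, not part of the original answer: `FlacInfo` and its methods are hypothetical names, and it assumes a plain FLAC file that starts with the `fLaC` marker followed directly by the STREAMINFO block (a file with an ID3 tag prepended would be rejected). It reads the sample rate, channel count, and bit depth from the header.

```csharp
using System;
using System.IO;

public static class FlacInfo
{
    // Hypothetical helper: decodes sample rate, channels and bits per sample
    // from the first 22 bytes of a FLAC file (marker + STREAMINFO start).
    public static (int SampleRate, int Channels, int BitsPerSample) Read(byte[] header)
    {
        if (header == null || header.Length < 22 ||
            header[0] != (byte)'f' || header[1] != (byte)'L' ||
            header[2] != (byte)'a' || header[3] != (byte)'C')
        {
            throw new InvalidDataException("Missing 'fLaC' marker.");
        }

        // Low 7 bits of byte 4 are the metadata block type; STREAMINFO is type 0.
        if ((header[4] & 0x7F) != 0)
        {
            throw new InvalidDataException("First metadata block is not STREAMINFO.");
        }

        // STREAMINFO starts at offset 8. After 10 bytes of block/frame sizes come:
        // 20 bits sample rate, 3 bits (channels - 1), 5 bits (bits per sample - 1).
        int sampleRate = (header[18] << 12) | (header[19] << 4) | (header[20] >> 4);
        int channels = ((header[20] >> 1) & 0x07) + 1;
        int bitsPerSample = (((header[20] & 0x01) << 4) | (header[21] >> 4)) + 1;
        return (sampleRate, channels, bitsPerSample);
    }

    // Reads just enough of the file to decode the header fields.
    public static (int SampleRate, int Channels, int BitsPerSample) ReadFile(string path)
    {
        byte[] header = new byte[22];
        using (FileStream fs = File.OpenRead(path))
        {
            int read = 0;
            while (read < header.Length)
            {
                int n = fs.Read(header, read, header.Length - read);
                if (n == 0) throw new InvalidDataException("File is too short to be FLAC.");
                read += n;
            }
        }
        return Read(header);
    }
}
```

A file that meets the advice above would report `Channels == 1` and `BitsPerSample == 16`. It is probably also worth comparing `SampleRate` with the `rate=` value sent in the Content-Type header; that the two need to agree is an assumption here, not something the original answer states.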


Then you can use this simple code to send the file for recognition:

string api_key = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx";
string path = @"C:\temp\good-morning-google.flac";

// Read the whole FLAC file into memory.
byte[] bytes = System.IO.File.ReadAllBytes(path);

string s;
using (WebClient client = new WebClient())
{
    // The content type declares FLAC audio and its sample rate.
    client.Headers.Add("Content-Type", "audio/x-flac; rate=44100");
    byte[] result = client.UploadData(string.Format(
        "https://www.google.com/speech-api/v2/recognize?client=chromium&lang=en-us&key={0}", api_key), "POST", bytes);

    // Raw response text returned by the server.
    s = client.Encoding.GetString(result);
}
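The string `s` is the raw response. As the output in the question shows, it consists of several lines, each holding its own JSON object, and the first one is usually an empty result. The following sketch pulls the first transcript out of such a response. `SpeechResponse.FirstTranscript` is a hypothetical helper, and using `System.Text.Json` instead of the Json.NET call from the question is an assumption made for illustration.

```csharp
using System;
using System.Text.Json;

public static class SpeechResponse
{
    // Hypothetical helper: returns the first transcript found in a
    // newline-delimited JSON response, or null when every result is empty.
    public static string FirstTranscript(string response)
    {
        foreach (string line in response.Split('\n'))
        {
            // Blank lines (such as the trailing one) carry no JSON.
            if (string.IsNullOrWhiteSpace(line)) continue;

            using (JsonDocument doc = JsonDocument.Parse(line))
            {
                // Skip objects like {"result":[]} that hold no recognition result.
                if (!doc.RootElement.TryGetProperty("result", out JsonElement results) ||
                    results.ValueKind != JsonValueKind.Array ||
                    results.GetArrayLength() == 0)
                {
                    continue;
                }

                JsonElement alternatives = results[0].GetProperty("alternative");
                if (alternatives.GetArrayLength() == 0) continue;

                return alternatives[0].GetProperty("transcript").GetString();
            }
        }
        return null;
    }
}
```

Calling `SpeechResponse.FirstTranscript(s)` and getting `null` back corresponds to the EMPTY case described in the question.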
