
Can I use the Google Speech Recognition API in my desktop application?

I want to know whether I can use Google's speech recognition API in my desktop application. The examples I have seen record the speech to a file and send that file to a URL. That would be cumbersome in my case, because the user has to submit voice input continuously. Is there another way to use the Google speech API? I would rather not use Sphinx: its accuracy is low, I don't know how to add new words to its dictionary, and it won't recognize words that are not in the dictionary. Any help would be appreciated.

Are you referring to ambient listening? I am working on a voice activity detection algorithm on top of the Google Speech Recognition API. It isn't finished yet, but I have added a volume and frequency calculator so that you don't have to send requests to Google while nobody is talking. Here is the link to the source code:

https://github.com/The-Shadow/java-speech-api

(This isn't what I use, but it's simple. You can also add frequency thresholds and so on. I threw this code together, so there is no guarantee it will work; look at the example branch of the API.)

import javax.sound.sampled.AudioFileFormat;

// Assumed package for MicrophoneAnalyzer; adjust if your copy of the library differs.
import com.darkprograms.speech.microphone.MicrophoneAnalyzer;
import com.darkprograms.speech.recognizer.GoogleResponse;
import com.darkprograms.speech.recognizer.Recognizer;

public class RecognitionMain {

    public static void main(String[] args) {
        try {
            // Loop here instead of recursing inside ambientListening(),
            // so the call stack does not grow without bound.
            while (true) {
                ambientListening();
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    // Records one utterance to a file and sends it to Google if speech was detected.
    private static void ambientListening() throws Exception {
        String filename = "tarunaudio.wav"; // your desired file name
        MicrophoneAnalyzer mic = new MicrophoneAnalyzer(AudioFileFormat.Type.WAVE);
        mic.open();
        mic.captureAudioToFile(filename);
        final int THRESHOLD = 10; // your threshold value
        int ambientVolume = mic.getAudioVolume(); // baseline taken at start-up
        int speakingVolume = -2;
        boolean speaking = false;
        // Sample at least once, then keep sampling for as long as someone is speaking.
        do {
            int volume = mic.getAudioVolume();
            System.out.println(volume);
            if (volume > ambientVolume + THRESHOLD) {
                speakingVolume = volume;
                speaking = true;
                Thread.sleep(1000);
                System.out.println("SPEAKING");
            }
            // Volume dropped well below the last speaking level: the utterance is over.
            if (speaking && volume + THRESHOLD < speakingVolume) {
                break;
            }
            Thread.sleep(200); // your refresh rate
        } while (speaking);
        mic.close();
        // You can also measure the volume across the entire file if you
        // don't mind it being resource intensive.
        if (!speaking) {
            return; // nothing was said; the caller will listen again
        }
        Recognizer rec = new Recognizer(Recognizer.Languages.ENGLISH_US);
        GoogleResponse out = rec.getRecognizedDataForWave(filename);
        System.out.println(out.getResponse());
    }
}
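
For readers who want to see what a volume check like the one above could look like without the library, here is a minimal sketch. It assumes the audio is 16-bit signed little-endian PCM and uses the root-mean-square of the samples as the "volume". `VolumeSketch`, `rms` and `isSpeech` are hypothetical names made up for this illustration; they are not part of java-speech-api, and the library's own `getAudioVolume()` may compute its value differently.

```java
// Hypothetical helper for illustration only; not part of java-speech-api.
final class VolumeSketch {

    /** Root-mean-square level of 16-bit signed little-endian PCM data. */
    static double rms(byte[] pcm, int length) {
        int samples = length / 2; // two bytes per sample
        if (samples == 0) {
            return 0.0;
        }
        double sumOfSquares = 0.0;
        for (int i = 0; i < samples; i++) {
            int lo = pcm[2 * i] & 0xFF; // low byte, treated as unsigned
            int hi = pcm[2 * i + 1];    // high byte, carries the sign
            int sample = (hi << 8) | lo;
            sumOfSquares += (double) sample * sample;
        }
        return Math.sqrt(sumOfSquares / samples);
    }

    /** True when the level exceeds the ambient baseline by more than the threshold. */
    static boolean isSpeech(double level, double ambient, double threshold) {
        return level > ambient + threshold;
    }

    public static void main(String[] args) {
        byte[] silence = new byte[8]; // four samples of value 0
        // Four samples alternating between +4096 and -4096.
        byte[] loud = {0x00, 0x10, 0x00, (byte) 0xF0, 0x00, 0x10, 0x00, (byte) 0xF0};
        double ambient = rms(silence, silence.length);
        double level = rms(loud, loud.length);
        System.out.println("ambient=" + ambient + " level=" + level);
        System.out.println("speech? " + isSpeech(level, ambient, 10));
    }
}
```

In a real application the byte array would come from something like a `javax.sound.sampled.TargetDataLine` opened with a matching 16-bit signed little-endian `AudioFormat`, and the threshold would need tuning for the microphone and the room.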

