
How to change the input in Android speech-to-text

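I'm new to Android programming and recently discovered the speech-to-text API on Android. I found plenty of tutorials online that explain well how to use this feature, but they all work the same way: the app uses an intent to launch recognition, and nowhere in the code do you specify the input.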

My question is: is it possible to do what you do with audio recording and specify exactly which audio source to use (for example, MediaRecorder.AudioSource.MIC)?

I believe this is the standard way to do it, but here is how I implemented speech-to-text:

private void askSpeechInput() {
    Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
    intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
            RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
    // EXTRA_LANGUAGE is documented as a language tag string such as "en-US"
    intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, "en-US");

    try {
        startActivityForResult(intent, REQ_CODE_SPEECH_INPUT);
    } catch (ActivityNotFoundException a) {
        // No activity on this device can handle ACTION_RECOGNIZE_SPEECH.
    }
}

Then I do whatever I want with the text I get back:

@Override
public void onActivityResult(int requestCode, int resultCode, Intent data) {
    super.onActivityResult(requestCode, resultCode, data);

    switch (requestCode) {
        case REQ_CODE_SPEECH_INPUT: {
            if (resultCode == RESULT_OK && null != data) {
                ArrayList<String> result = data.getStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS);
                if (result != null && !result.isEmpty()) {
                    String message = result.get(0); // take the first result
                    // Do whatever I want with my message
                }
            }
            break;
        }
    }
}

So this code takes its input from the microphone, but is it possible to change that?

Well, I don't know whether this will help anyone, but I found a solution to the problem.

First, I used an AudioRecord to record sound from the input I wanted, selected through MediaRecorder.AudioSource, and saved it to a file.

private void startRecording() {
    recorder = new AudioRecord(MediaRecorder.AudioSource.MIC,
            RECORDER_SAMPLERATE, RECORDER_CHANNELS,
            RECORDER_AUDIO_ENCODING, BufferElements2Rec * BytesPerElement);
    recorder.startRecording();
    isRecording = true;
    recordingThread = new Thread(new Runnable() {
        public void run() {
            writeAudioDataToFile();
        }
    }, "AudioRecorder Thread");
    recordingThread.start();
}

After that, I used a FLAC encoder I found to encode the .wav file as .flac.
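
AudioRecord delivers raw PCM samples, so a file written straight from it has no header, and a .wav file normally needs a RIFF/WAVE header in front of that data before an encoder will accept it. Here is a minimal sketch of building that 44-byte header, assuming uncompressed PCM. The class name WavHeader and its build method are hypothetical and not part of the original project, and the sample rate, channel count and bit depth must match whatever was passed to AudioRecord.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: builds the 44-byte canonical RIFF/WAVE header for PCM audio. */
final class WavHeader {
    static final int HEADER_SIZE = 44;

    /**
     * @param pcmDataLength number of raw PCM bytes that will follow the header
     * @param sampleRate    samples per second, e.g. 16000
     * @param channels      1 for mono, 2 for stereo
     * @param bitsPerSample 8 or 16
     */
    static byte[] build(int pcmDataLength, int sampleRate, int channels, int bitsPerSample) {
        int blockAlign = channels * bitsPerSample / 8; // bytes per sample frame
        int byteRate = sampleRate * blockAlign;        // bytes per second

        // All multi-byte fields in a WAV header are little-endian.
        ByteBuffer buf = ByteBuffer.allocate(HEADER_SIZE).order(ByteOrder.LITTLE_ENDIAN);
        buf.put("RIFF".getBytes(StandardCharsets.US_ASCII));
        buf.putInt(36 + pcmDataLength);                // size of everything after this field
        buf.put("WAVE".getBytes(StandardCharsets.US_ASCII));
        buf.put("fmt ".getBytes(StandardCharsets.US_ASCII));
        buf.putInt(16);                                // size of the PCM fmt chunk
        buf.putShort((short) 1);                       // audio format 1 = uncompressed PCM
        buf.putShort((short) channels);
        buf.putInt(sampleRate);
        buf.putInt(byteRate);
        buf.putShort((short) blockAlign);
        buf.putShort((short) bitsPerSample);
        buf.put("data".getBytes(StandardCharsets.US_ASCII));
        buf.putInt(pcmDataLength);                     // size of the sample data
        return buf.array();
    }
}
```

The header would be written to the output file first, followed by the recorded PCM bytes.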

Finally, I found some code that lets me send the FLAC file directly to the Google API and receive the text I wanted!

public void getTranscription(int sampleRate) {

    File myfil = new File(fileName);
    if (!myfil.canRead()) {
        Log.d("ParseStarter", "FATAL no read access");
        System.out.println("FATAL CAN'T READ");
        return; // nothing to upload if the file is unreadable
    }

    // first is a GET for the speech-api DOWNSTREAM
    // then a future exec for the UPSTREAM / chunked encoding used so as not
    // to limit
    // the POST body sz

    PAIR = MIN + (long) (Math.random() * ((MAX - MIN) + 1L));
    // DOWN URL just like in curl full-duplex example plus the handler
    downChannel(API_DOWN_URL + PAIR, messageHandler);

    // UP chan, process the audio byteStream for interface to UrlConnection
    // using 'chunked-encoding'
    // try-with-resources closes the stream and channel once the bytes are copied
    try (FileInputStream fis = new FileInputStream(myfil);
         FileChannel fc = fis.getChannel()) {
        // Get the file's size and then map it into memory
        int sz = (int) fc.size();
        MappedByteBuffer bb = fc.map(FileChannel.MapMode.READ_ONLY, 0, sz);
        byte[] data2 = new byte[bb.remaining()];
        Log.d("ParseStarter", "mapfil " + sz + " " + bb.remaining());
        bb.get(data2);
        // conform to the interface from the curl examples on full-duplex
        // calls
        // see curl examples full-duplex for more on 'PAIR'. Just a globally
        // uniq value typ=long->String.
        // API KEY value is part of value in UP_URL_p2
        upChannel(root + up_p1 + PAIR + up_p2 + api_key, messageHandler2,
                data2);
    } catch (IOException e) {
        // also covers FileNotFoundException, which is a subclass of IOException
        e.printStackTrace();
    }
}

private void downChannel(String urlStr, final Handler messageHandler) {

    final String url = urlStr;

    new Thread() {
        Bundle b;

        public void run() {
            String response = "NAO FOI";
            Message msg = Message.obtain();
            msg.what = 1;
            // handler for DOWN channel http response stream - httpsUrlConn
            // response handler should manage the connection.... ??
            // assign a TIMEOUT Value that exceeds by a safe factor
            // the amount of time that it will take to write the bytes
            // to the UPChannel in a fashion that mimics a liveStream
            // of the audio at the applicable Bitrate. BR=sampleRate * bits
            // per sample
            // Note that the TLS session uses
            // "* SSLv3, TLS alert, Client hello (1): "
            // to wake up the listener when there are additional bytes.
            // The mechanics of the TLS session should be transparent. Just
            // use
            // httpsUrlConn and allow it enough time to do its work.
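            Scanner inStream = openHttpsConnection(url);
            if (inStream == null) {
                // openHttpsConnection returns null on any error or non-200 status
                Log.d("ParseStarter", "DOWN channel: no response stream");
                return;
            }
            // forward each line of the response to the handler
            while (inStream.hasNextLine()) {
                b = new Bundle();
                b.putString("text", inStream.nextLine());
                msg.setData(b);
                messageHandler.dispatchMessage(msg);
            }
            inStream.close();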

        }
    }.start();
}

private void upChannel(String urlStr, final Handler messageHandler,
                       byte[] arg3) {

    final String murl = urlStr;
    final byte[] mdata = arg3;
    Log.d("ParseStarter", "upChan " + mdata.length);
    new Thread() {
        public void run() {
            StringBuilder response = new StringBuilder();
            Message msg = Message.obtain();
            msg.what = 2;
            Scanner inStream = openHttpsPostConnection(murl, mdata);
            if (inStream == null) {
                // openHttpsPostConnection returns null on any error or non-200 status
                Log.d("ParseStarter", "UP channel: no response stream");
                return;
            }
            // accumulate the response body line by line
            while (inStream.hasNextLine()) {
                response.append(inStream.nextLine());
                Log.d("ParseStarter", "POST resp " + response.length());
            }
            inStream.close(); // mind the resources
            Bundle b = new Bundle();
            b.putString("post", response.toString());
            msg.setData(b);
            messageHandler.sendMessage(msg);

        }
    }.start();

}

// GET for DOWNSTREAM
private Scanner openHttpsConnection(String urlStr) {
    InputStream in = null;
    int resCode = -1;
    Log.d("ParseStarter", "dwnURL " + urlStr);

    try {
        URL url = new URL(urlStr);
        URLConnection urlConn = url.openConnection();

        if (!(urlConn instanceof HttpsURLConnection)) {
            throw new IOException("URL is not an Https URL");
        }

        HttpsURLConnection httpConn = (HttpsURLConnection) urlConn;
        httpConn.setAllowUserInteraction(false);
        // TIMEOUT is required
        httpConn.setInstanceFollowRedirects(true);
        httpConn.setRequestMethod("GET");

        httpConn.connect();

        resCode = httpConn.getResponseCode();
        if (resCode == HttpsURLConnection.HTTP_OK) {
            return new Scanner(httpConn.getInputStream());
        }

    } catch (MalformedURLException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
    return null;
}

// POST for UPSTREAM
private Scanner openHttpsPostConnection(String urlStr, byte[] data) {
    InputStream in = null;
    byte[] mextrad = data;
    int resCode = -1;
    OutputStream out = null;
    // int http_status;
    try {
        URL url = new URL(urlStr);
        URLConnection urlConn = url.openConnection();

        if (!(urlConn instanceof HttpsURLConnection)) {
            throw new IOException("URL is not an Https URL");
        }

        HttpsURLConnection httpConn = (HttpsURLConnection) urlConn;
        httpConn.setAllowUserInteraction(false);
        httpConn.setInstanceFollowRedirects(true);
        httpConn.setRequestMethod("POST");
        httpConn.setDoOutput(true);
        httpConn.setChunkedStreamingMode(0);
        httpConn.setRequestProperty("Content-Type", "audio/x-flac; rate="
                + rate );
        httpConn.connect();

        try {
            // this opens a connection, then sends POST & headers.
            out = httpConn.getOutputStream();
            // Note : if the audio is more than 15 seconds
            // dont write it to UrlConnInputStream all in one block as this
            // sample does.
            // Rather, segment the byteArray and on intermittently, sleeping
            // thread
            // supply bytes to the urlConn Stream at a rate that approaches
            // the bitrate ( =30K per sec. in this instance ).
            Log.d("ParseStarter", "IO beg on data");
            out.write(mextrad); // one big block supplied instantly to the
            // underlying chunker wont work for duration
            // > 15 s.
            Log.d("ParseStarter", "IO fin on data");
            // do you need the trailer?
            // NOW you can look at the status.
            resCode = httpConn.getResponseCode();

            // log the status code and reason phrase (not the byte array's identity)
            Log.d("ParseStarter", "POST resp " + resCode + " "
                    + httpConn.getResponseMessage());

            if (resCode / 100 != 2) {
                Log.d("ParseStarter", "POST bad io ");
            }

        } catch (IOException e) {
            Log.d("ParseStarter", "FATAL " + e);

        }

        if (resCode == HttpsURLConnection.HTTP_OK) {
            Log.d("ParseStarter", "OK RESP to POST return scanner ");
            return new Scanner(httpConn.getInputStream());
        }
    } catch (MalformedURLException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
    return null;
}
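
The comments in openHttpsPostConnection warn that audio longer than about 15 seconds should not be written in one block, and should be fed in segments at a pace close to the stream's bitrate. One way that could look is sketched below. ThrottledUpload, writeThrottled and their parameters are hypothetical names, and the chunk size and rate that actually work depend on the encoded audio; neither is a value confirmed by the original code.

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.io.OutputStream;

/** Hypothetical helper: writes a byte array in paced chunks instead of one block. */
final class ThrottledUpload {

    /** Byte rate of uncompressed PCM audio; an encoded stream will be lower. */
    static int bytesPerSecond(int sampleRate, int bitsPerSample, int channels) {
        return sampleRate * (bitsPerSample / 8) * channels;
    }

    /**
     * Writes data to out in chunks of at most chunkSize bytes, sleeping between
     * chunks so the average throughput stays near bytesPerSecond.
     *
     * @return the number of chunks written
     */
    static int writeThrottled(OutputStream out, byte[] data, int chunkSize, int bytesPerSecond)
            throws IOException {
        if (chunkSize <= 0 || bytesPerSecond <= 0) {
            throw new IllegalArgumentException("chunkSize and bytesPerSecond must be positive");
        }
        int chunks = 0;
        for (int offset = 0; offset < data.length; offset += chunkSize) {
            int len = Math.min(chunkSize, data.length - offset);
            out.write(data, offset, len);
            out.flush(); // push the chunk out now rather than buffering it
            chunks++;
            if (offset + len < data.length) { // no need to wait after the last chunk
                try {
                    // time this chunk would take to play at the target rate
                    Thread.sleep(1000L * len / bytesPerSecond);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt status
                    throw new InterruptedIOException("upload interrupted");
                }
            }
        }
        return chunks;
    }
}
```

In openHttpsPostConnection this would stand in for the single out.write(mextrad) call, which already runs on a background thread, so the sleeps do not block the UI.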

// DOWN handler
Handler messageHandler = new Handler() {

    public void handleMessage(Message msg) {
        super.handleMessage(msg);
        switch (msg.what) {
            case 1: // GET DOWNSTREAM json
                String mtxt = msg.getData().getString("text");
                if (mtxt != null && mtxt.length() > 20) { // skip missing or very short lines
                    final String f_msg = mtxt;
                    handler.post(new Runnable() { // This thread runs in the UI
                        // TREATMENT FOR GOOGLE RESPONSE
                        @Override
                        public void run() {
                            System.out.println(f_msg);


                            String message = "";
                            final ChatMessage chatMessage = new ChatMessage(user1, user2,
                                    message, "" + random.nextInt(1000), true);
                            message = f_msg;
                            chatMessage.setMsgID();
                            chatMessage.body = message;
                            chatMessage.Date = CommonMethods.getCurrentDate();
                            chatMessage.Time = CommonMethods.getCurrentTime();
                            msg_edittext.setText("");
                            chatAdapter.add(chatMessage);
                            chatAdapter.notifyDataSetChanged();
                        }
                    });
                }
                break;
            case 2:
                break;
        }
    }
}; // doDOWNSTRM Handler end

// UPSTREAM channel. its servicing a thread and should have its own handler
Handler messageHandler2 = new Handler() {

    public void handleMessage(Message msg) {
        super.handleMessage(msg);
        switch (msg.what) {
            case 1: // GET DOWNSTREAM json
                Log.d("ParseStarter", msg.getData().getString("post"));
                break;
            case 2:
                Log.d("ParseStarter", msg.getData().getString("post"));
                break;
        }

    }
}; // UPstream handler end

I took this part of the code from a project; the connection to the Google API works, but the file encoder seems to be outdated.
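
The DOWN handler above puts the raw response line into the chat message. If only the recognized text is wanted, it has to be pulled out of the JSON. The sketch below assumes each non-empty response line contains a "transcript" string field; that shape is not shown in the code above and should be checked against the responses actually received. TranscriptExtractor is a hypothetical name, and the regex is a deliberate shortcut that does not decode JSON escape sequences. On Android, a proper JSON parser such as org.json would be the more robust choice.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: pulls the first "transcript" value out of a JSON response line. */
final class TranscriptExtractor {

    // Matches "transcript":"<value>", allowing escaped characters inside the value.
    private static final Pattern TRANSCRIPT =
            Pattern.compile("\"transcript\"\\s*:\\s*\"((?:[^\"\\\\]|\\\\.)*)\"");

    /** Returns the first transcript found, or empty if the line has none. */
    static Optional<String> firstTranscript(String jsonLine) {
        if (jsonLine == null) {
            return Optional.empty();
        }
        Matcher m = TRANSCRIPT.matcher(jsonLine);
        // The captured value is returned as-is; escape sequences are not decoded.
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }
}
```

In the handler, the result of firstTranscript(f_msg) could be used as the message body in place of the raw line, with a fallback when it is empty.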
