IBM Watson Speech to Text: processing large files

Question

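I have been trying to use the BlueMix SpeechToText Java library, specifically the SpeechToText class in com.ibm.watson.developer_cloud.speech_to_text.v1.

I have very long wav files that I want to convert to text. The files are about 70MB each. The goal is to use the Java API ( http://www.ibm.com/smarterplanet/us/en/ibmwatson/developercloud/speech-to-text/api/v1/?java#recognize ) to recognize the text. I realize that I need to check the status of the call every 30 seconds, and that once the transcription finishes I have only 30 seconds to retrieve the final result.

To do this with the RESTful API, I need to create a session and then bind my recognition request to that session, so that I can query the status of the job running on the session.

I have tried to create a session, but the session is never available. I have verified that it appears to work with the provided webapp ( https://stream.watsonplatform.net/speech-to-text/api/v1/sessions?Method=GET ).

I have also tried writing my own client and setting the cookie retrieved from session creation, but that does not work either.

I have also tried connecting over secure websockets, but could not establish a successful connection.

Below is some sample code I have been using.

Any ideas?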

public class Speech2Text extends WatsonService {
private static final Logger logger = LoggerFactory.getLogger(Speech2Text.class);
public static void main(String[] args) throws FileNotFoundException,
        UnsupportedEncodingException, InterruptedException {
    Speech2Text s2t = new Speech2Text();
    s2t.httpClient();
    // try {
    // s2t.webSocketClient();
    // } catch (URISyntaxException e) {
    // TODO Auto-generated catch block
    // e.printStackTrace();
    // } catch (IOException e) {
    // TODO Auto-generated catch block
    // e.printStackTrace();
    // }
}
public void httpClient() throws FileNotFoundException,UnsupportedEncodingException {
    logger.info("Running http client");
    final Stopwatch stopwatch = Stopwatch.createStarted();
    SpeechToText service = new SpeechToText();
    service.setUsernameAndPassword("XXXXXX","XXXXX");
    List<SpeechModel> models = service.getModels();
    for (SpeechModel model : models) {
        logger.info(model.getName());
    }
    SpeechSession session = service.createSession("en-US_NarrowbandModel");
    System.out.println(session.toString());
    SessionStatus status = service.getRecognizeStatus(session);
    logger.info(status.getModel());
    logger.info(service.getEndPoint());
    File audio = new File("/home/baaron/watson-bluemix/answer_06.wav");
    Map params = new HashMap();
    params.put("audio", audio);
    params.put("content_type", "audio/wav");
    params.put("continuous", "true");
    params.put("session_id", session.getSessionId());
    logger.info(service.getEndPoint());
    SpeechResults transcript = service.recognize(params);
    PrintWriter writer = new PrintWriter("/home/baaron/watson-bluemix/PCCJPApart1test.transcript", "UTF-8");
    writer.println(transcript.toString());
    SessionStatus status1 = service.getRecognizeStatus(session.getSessionId());
    System.out.println(status1);
    service.deleteSession(session.getSessionId());
    writer.close();
    stopwatch.stop();
    logger.info("Processing took: " + stopwatch + ".");
}
public void webSocketClient() throws URISyntaxException, IOException,
        InterruptedException {
    logger.info("Running web socket client");
    String encoding = new String(Base64.encodeBase64String("XXXXXXXXXX".getBytes()));
    HttpPost httppost = new HttpPost(
            "https://stream.watsonplatform.net/authorization/api/v1/token?url=https://stream.watsonplatform.net/speech-to-text/api");
    httppost.setHeader("Authorization", "Basic " + encoding);
    System.out.println("executing request " + httppost.getRequestLine());
    DefaultHttpClient httpclient = new DefaultHttpClient();
    HttpResponse response = httpclient.execute(httppost);
    HttpEntity entity = response.getEntity();
    logger.info(response.getStatusLine().getReasonPhrase());
    WebSocketImpl.DEBUG = true;
    BufferedReader reader = new BufferedReader(new InputStreamReader(entity.getContent()));
    StringBuilder out = new StringBuilder();
    String line;
    while ((line = reader.readLine()) != null) {
        out.append(line);
    }
    String token = out.toString();
    final WebSocketClient client = new WebSocketClient(
            new URI("wss://stream.watsonplatform.net/speech-to-text-beta/api/v1/recognize?watson-token=" + token)) {
        @Override
        public void onMessage(String message) {
            JSONObject obj = new JSONObject(message);
            // String channel = obj.getString("channel");
        }
        @Override
        public void onOpen(ServerHandshake handshake) {
            System.out.println("opened connection");
        }
        @Override
        public void onClose(int code, String reason, boolean remote) {
            System.out.println("closed connection");
        }
        @Override
        public void onError(Exception ex) {
            ex.printStackTrace();
        }
    };
    // open websocket
    SSLContext sslContext = null;
    try {
        sslContext = SSLContext.getInstance("TLS");
        sslContext.init(null, null, null); 
    } catch (NoSuchAlgorithmException e) {
        e.printStackTrace();
    } catch (KeyManagementException e) {
        e.printStackTrace();
    }
    client.setWebSocketFactory(new DefaultSSLWebSocketClientFactory(
            sslContext));
    logger.info("CONNECTED: " + client.connectBlocking());
    JSONObject obj = new JSONObject();
    obj.put("action", "start");
    obj.put("content-type", "audio/wav");
    client.send(obj.toString());
    logger.info("Done");
  }
}

Answer 1

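A GET on https://stream.watsonplatform.net/speech-to-text/api/v1/sessions will not list your sessions, even if they have been created.

The way to check whether you have a session is to do a GET on https://stream.watsonplatform.net/speech-to-text/api/v1/sessions/yourSessionId

If the session exists you will get a 200 response, otherwise a 404. Remember to enable cookies for this.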
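
As a rough illustration of that check, here is a minimal sketch using the JDK's own java.net.http client (Java 11+) rather than the Watson SDK. The class SessionCheck and all of its method names are hypothetical, and the use of basic authentication is an assumption carried over from the question's code. The CookieManager is what "enabling cookies" amounts to on this client.

```java
import java.io.IOException;
import java.net.CookieManager;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper for illustration only; not part of the Watson Java SDK.
class SessionCheck {
    static final String SESSIONS_URL =
            "https://stream.watsonplatform.net/speech-to-text/api/v1/sessions";

    // The bare /sessions URL does not list sessions, so query one session by id.
    static URI sessionUri(String sessionId) {
        return URI.create(SESSIONS_URL + "/" + sessionId);
    }

    // Basic auth header value (assumed to match the service credentials).
    static String basicAuth(String username, String password) {
        String pair = username + ":" + password;
        return "Basic " + Base64.getEncoder()
                .encodeToString(pair.getBytes(StandardCharsets.UTF_8));
    }

    // 200 means the session exists, 404 means it does not; anything else is unexpected.
    static boolean existsForStatus(int statusCode) {
        if (statusCode == 200) {
            return true;
        }
        if (statusCode == 404) {
            return false;
        }
        throw new IllegalStateException("Unexpected HTTP status: " + statusCode);
    }

    // A client that stores and resends cookies across requests.
    static HttpClient newCookieAwareClient() {
        return HttpClient.newBuilder()
                .cookieHandler(new CookieManager())
                .build();
    }

    // Performs the GET and maps the status code to a boolean.
    static boolean sessionExists(HttpClient client, String sessionId, String authHeader)
            throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(sessionUri(sessionId))
                .header("Authorization", authHeader)
                .GET()
                .build();
        HttpResponse<Void> response =
                client.send(request, HttpResponse.BodyHandlers.discarding());
        return existsForStatus(response.statusCode());
    }
}
```

The client returned by newCookieAwareClient() would presumably need to be the same instance that created the session, so that the session cookie is already in its cookie store when sessionExists(...) is called.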

Answer 2

If all you want is to transcribe audio files, you can do:

SpeechToText service = new SpeechToText();
service.setUsernameAndPassword("{username}", "{password}");

RecognizeOptions options = new RecognizeOptions.Builder()
  .contentType("audio/wav")
  .continuous(true)
  .model("en-US_NarrowbandModel")
  .inactivityTimeout(-1) // Seconds of silence after which the connection is closed; -1 disables the timeout
  .build();

String[] files = {"file1.wav", "file2.wav"};
for (String file : files) {
  SpeechResults results = service.recognize(new File(file), options).execute();
  System.out.println(results); // print results (you could write them to a file)
}
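
If you would rather write each transcript to a file than print it, something along these lines could work. This is only a sketch: TranscriptWriter and its methods are hypothetical names, and the ".transcript" extension is an assumption borrowed from the file name in the question.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration only; not part of the Watson Java SDK.
class TranscriptWriter {
    // Maps "file1.wav" to "file1.transcript" in the same directory as the audio file.
    static Path transcriptPathFor(String audioFile) {
        Path audio = Paths.get(audioFile);
        String name = audio.getFileName().toString();
        int dot = name.lastIndexOf('.');
        String base = dot > 0 ? name.substring(0, dot) : name;
        return audio.resolveSibling(base + ".transcript");
    }

    // Writes the transcript text as UTF-8 and returns the path that was written.
    static Path write(String audioFile, String transcript) throws IOException {
        Path target = transcriptPathFor(audioFile);
        Files.write(target, transcript.getBytes(StandardCharsets.UTF_8));
        return target;
    }
}
```

Inside the loop above you would then call TranscriptWriter.write(file, results.toString()) in place of the println, and handle the IOException it may throw.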

Make sure you are using the latest version of the Java SDK.

Maven

<dependency>
  <groupId>com.ibm.watson.developer_cloud</groupId>
  <artifactId>java-sdk</artifactId>
  <version>3.8.0</version>
</dependency>

Gradle

compile 'com.ibm.watson.developer_cloud:java-sdk:3.8.0'

2 answers

Answer 1
3 2015-10-27 12:58:31

Answer 2
0 2017-04-30 21:17:17