
How to resolve error 400 when using the Watson Speech to Text API

I'm trying to pass audio files (mp3, m4a, or flac) to the code below so that the Watson API returns a transcript. I've tried several different files, some extracted from video and some recorded directly with the Windows recorder, ranging from about 1 MB to 12 MB. Every attempt returns the error below. I couldn't find answers to similar questions on other websites.

# Install the SDK first (shell command): pip install ibm_watson
apikey = 'xxxx'
url = 'yyyy'

from ibm_watson import SpeechToTextV1
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
import subprocess
import os

authenticator = IAMAuthenticator(apikey)
stt = SpeechToTextV1(authenticator = authenticator)
stt.set_service_url(url)

f= "file_path"
res = stt.recognize(audio=f, content_type='audio/m4a', model='en-US_NarrowbandModel', continuous=True, inactivity_timeout=360).get_result()

ApiException                              Traceback (most recent call last)
<ipython-input-24-cfbd4e46f426> in <module>()
      3 f= "file_path"
      4 res = stt.recognize(audio=f, content_type='audio/m4a', model='en-US_NarrowbandModel', continuous=True,
----> 5                     inactivity_timeout=360).get_result()

1 frames
/usr/local/lib/python3.7/dist-packages/ibm_cloud_sdk_core/base_service.py in send(self, request, **kwargs)
    300                                         status_code=response.status_code)
    301 
--> 302             raise ApiException(response.status_code, http_response=response)
    303         except requests.exceptions.SSLError:
    304             logging.exception(self.ERROR_MSG_DISABLE_SSL)

ApiException: Error: Stream was 9 bytes but needs to be at least 100 bytes., Code: 400 , X-global-transaction-id: 927c7d31-c030-4d71-8998-aa544b1ae111

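I can't test it, but the error says the stream was 9 bytes, and len("file_path") gives 9. That suggests the path string itself is being sent as the audio data instead of the file's contents.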

Probably it needs

audio=open(f, 'rb').read() 

instead of audio=f
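The byte count in the error can be checked locally without calling the service. The sketch below is only illustrative: the temporary file is a hypothetical stand-in for a real recording, and it assumes that a str passed as audio ends up in the request body as its UTF-8 bytes.

```python
import os
import tempfile

# Hypothetical stand-in for a real audio file: 200 arbitrary bytes.
with tempfile.NamedTemporaryFile(suffix=".flac", delete=False) as tmp:
    tmp.write(b"\x00" * 200)
    audio_path = tmp.name

# What the question's code passes: the literal path string.
literal = "file_path"
bytes_if_string_is_sent = len(literal.encode("utf-8"))

# What should be passed: the contents of the file, opened in binary mode.
with open(audio_path, "rb") as audio_file:
    bytes_if_file_is_sent = len(audio_file.read())

print("string payload:", bytes_if_string_is_sent, "bytes")
print("file payload:", bytes_if_file_is_sent, "bytes")

os.remove(audio_path)  # clean up the temporary file
```

The first number matches the "9 bytes" in the error message, which is why the size of the actual audio file made no difference.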


EDIT:

The documentation for recognize() shows an example that uses

with open('file_path', 'rb') as audio_file:
    speech_to_text.recognize(audio=audio_file, ...)

so it means you may need

audio=open(f, 'rb')

without .read()
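Applied to the code from the question, a minimal sketch might look like this. I still can't test it against the service, so treat it as a starting point. check_audio_file, transcribe and MIN_STREAM_BYTES are hypothetical names introduced here; the 100-byte minimum comes from the error message, not from the documentation; the model and inactivity_timeout values are copied from the question. The continuous argument is left out because it does not appear in the documentation example, and I am not sure audio/m4a is an accepted content type, so the default below is audio/flac as in the documentation example.

```python
import os

# Minimum size taken from the error message ("needs to be at least 100 bytes").
MIN_STREAM_BYTES = 100


def check_audio_file(path):
    """Fail early if path is not a real file or is too small to be audio."""
    if not os.path.isfile(path):
        raise FileNotFoundError(f"No such audio file: {path!r}")
    size = os.path.getsize(path)
    if size < MIN_STREAM_BYTES:
        raise ValueError(f"{path!r} is only {size} bytes")
    return size


def transcribe(path, apikey, url, content_type="audio/flac"):
    """Send the file's contents (not its path) to Speech to Text."""
    # Imported here so check_audio_file can be used without the SDK installed.
    from ibm_watson import SpeechToTextV1
    from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

    check_audio_file(path)

    stt = SpeechToTextV1(authenticator=IAMAuthenticator(apikey))
    stt.set_service_url(url)

    # Open in binary mode and pass the file object, as in the documentation.
    with open(path, "rb") as audio_file:
        return stt.recognize(
            audio=audio_file,
            content_type=content_type,  # must match the actual file format
            model="en-US_NarrowbandModel",
            inactivity_timeout=360,
        ).get_result()
```

Usage would be something like res = transcribe("my_audio.flac", apikey, url), where my_audio.flac is a placeholder for your own file.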


Full example from the documentation:

import json
from os.path import join, dirname
from ibm_watson import SpeechToTextV1
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

authenticator = IAMAuthenticator('{apikey}')
speech_to_text = SpeechToTextV1(
    authenticator=authenticator
)

speech_to_text.set_service_url('{url}')

with open(join(dirname(__file__), './.', 'audio-file2.flac'),
               'rb') as audio_file:
    speech_recognition_results = speech_to_text.recognize(
        audio=audio_file,
        content_type='audio/flac',
        word_alternatives_threshold=0.9,
        keywords=['colorado', 'tornado', 'tornadoes'],
        keywords_threshold=0.5
    ).get_result()
print(json.dumps(speech_recognition_results, indent=2))

You have to click the recognize() link and scroll down to see it in the documentation.
