I'm trying to send audio files (mp3, m4a or flac) to the Watson Speech to Text API with the code below in order to get a transcript. I've tried different audio files, some extracted from video and some recorded directly with the Windows recorder, with sizes from about 1 to 12 MB. Every attempt returns the error below. I couldn't find an answer in similar questions on other websites.
pip install ibm_watson
apikey = 'xxxx'
url = 'yyyy'
from ibm_watson import SpeechToTextV1
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
import subprocess
import os
authenticator = IAMAuthenticator(apikey)
stt = SpeechToTextV1(authenticator = authenticator)
stt.set_service_url(url)
f= "file_path"
res = stt.recognize(audio=f, content_type='audio/m4a', model='en-US_NarrowbandModel', continuous=True, inactivity_timeout=360).get_result()
ApiException Traceback (most recent call last)
<ipython-input-24-cfbd4e46f426> in <module>()
3 f= "file_path"
4 res = stt.recognize(audio=f, content_type='audio/m4a', model='en-US_NarrowbandModel', continuous=True,
----> 5 inactivity_timeout=360).get_result()
1 frames
/usr/local/lib/python3.7/dist-packages/ibm_cloud_sdk_core/base_service.py in send(self, request, **kwargs)
300 status_code=response.status_code)
301
--> 302 raise ApiException(response.status_code, http_response=response)
303 except requests.exceptions.SSLError:
304 logging.exception(self.ERROR_MSG_DISABLE_SSL)
ApiException: Error: Stream was 9 bytes but needs to be at least 100 bytes., Code: 400 , X-global-transaction-id: 927c7d31-c030-4d71-8998-aa544b1ae111
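I can't test it, but the error says the stream was 9 bytes, and len("file_path") gives 9. So it looks like the path string itself is being sent as the audio data instead of the file's contents.
Probably it needs
audio=open(f, 'rb').read()
instead of audio=f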
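To illustrate the diagnosis, here is a small self-contained sketch that makes no Watson call. It compares the size of the path string with the size of a file's contents. The temporary file and its 200 bytes of dummy data are assumptions made up for the demonstration.

```python
import os
import tempfile

# The literal path string from the question: as data, it is only 9 bytes long.
path_as_text = "file_path"
print(len(path_as_text.encode("utf-8")))  # 9, matching "Stream was 9 bytes"

# Hypothetical stand-in for a real audio file: 200 bytes of dummy data.
with tempfile.NamedTemporaryFile(suffix=".flac", delete=False) as tmp:
    tmp.write(b"\x00" * 200)
    demo_path = tmp.name

# Reading the file in binary mode yields its contents, not its name.
with open(demo_path, "rb") as audio_file:
    audio_bytes = audio_file.read()

print(len(audio_bytes))  # 200
os.remove(demo_path)
```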
EDIT:
The documentation for recognize() shows an example that uses
with open('file_path', 'rb') as audio_file:
    speech_to_text.recognize(audio=audio_file, ...)
so you may need to pass the open file object,
audio=open(f, 'rb')
without .read()
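Applied to the code in the question, the fix might look like the sketch below. The helper name transcribe is hypothetical. It only assumes that the client has a recognize(audio=..., content_type=..., model=...) method whose return value offers get_result(), as in the documentation example. I haven't run it against the real service.

```python
def transcribe(stt, path, content_type, model="en-US_NarrowbandModel"):
    """Send the contents of the file at `path` to the service.

    `stt` is assumed to be a configured SpeechToTextV1 client.
    This helper is hypothetical and not part of the ibm_watson SDK.
    """
    # Open in binary mode and pass the file object, not the path string.
    with open(path, "rb") as audio_file:
        response = stt.recognize(
            audio=audio_file,
            content_type=content_type,
            model=model,
        )
        # Fetch the result while the file is still open.
        return response.get_result()
```

With the stt object from the question, this would be called as res = transcribe(stt, f, 'audio/flac'), where f holds the real path to the audio file and the content type matches the file's format.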
Full example from documentation:
import json
from os.path import join, dirname
from ibm_watson import SpeechToTextV1
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
authenticator = IAMAuthenticator('{apikey}')
speech_to_text = SpeechToTextV1(
    authenticator=authenticator
)
speech_to_text.set_service_url('{url}')
with open(join(dirname(__file__), './.', 'audio-file2.flac'),
          'rb') as audio_file:
    speech_recognition_results = speech_to_text.recognize(
        audio=audio_file,
        content_type='audio/flac',
        word_alternatives_threshold=0.9,
        keywords=['colorado', 'tornado', 'tornadoes'],
        keywords_threshold=0.5
    ).get_result()
print(json.dumps(speech_recognition_results, indent=2))
To see this example in the documentation, you have to click the recognize() link and scroll down.
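Since the goal is a transcript, the JSON printed by the example can be reduced to plain text. The sketch below assumes the result holds a list under "results", where each item has "alternatives" containing a "transcript" string. The helper name extract_transcript and the sample dictionary are made up for illustration, so check them against the actual output you get.

```python
def extract_transcript(result):
    """Join the first alternative of each result chunk into one string."""
    pieces = []
    for chunk in result.get("results", []):
        alternatives = chunk.get("alternatives", [])
        if alternatives:
            # The first alternative is taken as the best guess.
            pieces.append(alternatives[0].get("transcript", "").strip())
    return " ".join(piece for piece in pieces if piece)


# Made-up sample in the assumed response shape (not real service output).
sample = {
    "results": [
        {"alternatives": [{"transcript": "hello world ", "confidence": 0.9}]},
        {"alternatives": [{"transcript": "second sentence "}]},
    ]
}
print(extract_transcript(sample))  # hello world second sentence
```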