
Transcribing sound files to text in Python and the Google Speech API

I have a bunch of WAV files. I made a simple script to convert them to FLAC so I can use them with the Google Speech API. Here is the Python code:

import urllib2
url = "https://www.google.com/speech-api/v1/recognize?client=chromium&lang=en-US"
audio = open('somefile.flac','rb').read()
headers={'Content-Type': 'audio/x-flac; rate=16000', 'User-Agent':'Mozilla/5.0'}
request = urllib2.Request(url, data=audio, headers=headers)
response = urllib2.urlopen(request)
print response.read()

However, I am getting this error:

Traceback (most recent call last):
  File "transcribe.py", line 7, in <module>
    response = urllib2.urlopen(request)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 392, in open
    response = self._open(req, data)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 410, in _open
    '_open', req)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 370, in _call_chain
    result = func(*args)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 1194, in https_open
    return self.do_open(httplib.HTTPSConnection, req)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 1161, in do_open
    raise URLError(err)
urllib2.URLError: <urlopen error [Errno 32] Broken pipe>

I thought at first that it was because the file was too big. But I recorded myself for 5 seconds and it still does the same thing.

I don't think Google has released the API yet, so it's hard to understand why it's failing.

Is there any other good speech-to-text API out there that can be used from either Python or Node?

----- Edit: my attempt with requests -----

import json
import requests
url = 'https://www.google.com/speech-api/v1/recognize?client=chromium&lang=en-US'
data = {'file': open('file.flac', 'rb')}
headers = {'Content-Type': 'audio/x-flac; rate=16000', 'User-Agent':'Mozilla/5.0'}
r = requests.post(url, data=data, headers=headers)
# r = requests.post(url, files=data, headers=headers) ## does not work either
# r = requests.post(url, data=open('file.flac', 'rb').read(), headers=headers) ## does not work either
print r.text

This produced the same problem as above.

The API accepts HTTP POST requests. You're using an HTTP GET request here. This can be confirmed by loading the URI from your code directly into a browser:

HTTP method GET is not supported by this URL

Error 405
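
Which method a request will use can also be checked locally, without sending anything. The following is a minimal sketch under some assumptions: it uses Python 3's `urllib.request` rather than the Python 2 `urllib2` module from the question (the `Request` class picks its method the same way in both), and the placeholder bytes stand in for the contents of a real FLAC file. The helper name `build_request` is made up for illustration.

```python
from urllib.request import Request

URL = "https://www.google.com/speech-api/v1/recognize?client=chromium&lang=en-US"
HEADERS = {
    "Content-Type": "audio/x-flac; rate=16000",
    "User-Agent": "Mozilla/5.0",
}


def build_request(audio_bytes=None):
    """Build (but do not send) a request; audio_bytes=None means no body."""
    return Request(URL, data=audio_bytes, headers=HEADERS)


# Placeholder payload standing in for a real FLAC file (assumption).
fake_audio = b"fLaC placeholder"

with_body = build_request(fake_audio)
without_body = build_request()

# Request reports POST when a data payload is attached and GET otherwise.
print(with_body.get_method())
print(without_body.get_method())
```

Typing the URL into a browser address bar corresponds to the second case, a request with no body.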

Also, I'd recommend using the requests Python library. See http://www.python-requests.org/en/latest/user/quickstart/#post-a-multipart-encoded-file

Lastly, it seems that the API only accepts segments up to 15 seconds long. Perhaps the error occurs because your file is too large? If you can upload an example FLAC file, perhaps we could diagnose further.
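
To rule out the length limit before converting and uploading, the duration of each source WAV file can be computed with the standard-library `wave` module. This is only a sketch: the 15-second figure is the apparent limit mentioned above rather than a documented value, the function names are made up for illustration, and the code is written for Python 3.

```python
import wave

# Apparent limit mentioned above; an assumption, not a documented value.
MAX_SECONDS = 15.0


def wav_duration_seconds(path):
    """Return the duration of a WAV file in seconds (frame count / sample rate)."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / float(wav.getframerate())


def is_short_enough(path, limit=MAX_SECONDS):
    """Return True if the WAV file at `path` is no longer than `limit` seconds."""
    return wav_duration_seconds(path) <= limit
```

Calling `is_short_enough('somefile.wav')` on each file before conversion would flag recordings that need to be split into shorter segments first.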


 