简体   繁体   English

如何将长语音转换为文本。 Python 中的 Google 异步语音识别

[英]How to convert long speech to text. Google asyncronous speech recognition in Python

Does anyone know how to convert long speech in wav or MP3 format to text, by using Google speech to text asyncronous API?有谁知道如何通过使用谷歌语音到文本异步 API 将 wav 或 MP3 格式的长语音转换为文本? I am using the code below but I always get an error finding upload_blob() function.我正在使用下面的代码,但在查找 upload_blob() function 时总是出错。

Thanks in advance.提前致谢。

from pydub import AudioSegment
import io
import os
from google.cloud import speech
from google.cloud.speech import enums
from google.cloud.speech import types
import wave
from google.cloud import storage

filepath = "/home/marte/Documentos/Python_Scripts/Law/" 
output_filepath = "/home/marte/Documentos/Python_Scripts/Law/" 
bucketname = "callsaudiofiles" 
def frame_rate_channel(audio_file_name):
    with wave.open(audio_file_name, "rb") as wave_file:
        frame_rate = wave_file.getframerate()
        channels = wave_file.getnchannels()
        return frame_rate,channels
def google_transcribe(audio_file_name):
    file_name = filepath + audio_file_name 
    frame_rate, channels = frame_rate_channel(audio_file_name)
    bucket_name = bucketname
    source_file_name = filepath + audio_file_name
    destination_blob_name = audio_file_name
    upload_blob(bucket_name, source_file_name, destination_blob_name)
    gcs_uri = 'gs://' + bucketname + '/' + audio_file_name
    transcript = ''
    client = speech.SpeechClient()
    audio = types.RecognitionAudio(uri=gcs_uri)
    config = types.RecognitionConfig(
    encoding=enums.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=frame_rate,
    language_code='es-CO')
    operation = client.long_running_recognize(config, audio)
    response = operation.result(timeout=10000)
    for result in response.results:
        transcript += result.alternatives[0].transcript
    delete_blob(bucket_name, destination_blob_name)
    return transcript
def write_transcripts(transcript_filename,transcript):
    f= open(output_filepath + transcript_filename,"w+")
    f.write(transcript)
    f.close()
sentencia = google_transcribe("onu-santos.wav")
write_transcripts("onu-santos.txt",sentencia)

I finally manage to solve the problem by setting a new "segment" in my Google Cloud Storage account and setting the credentials as a local variable by:我终于设法通过在我的 Google Cloud Storage 帐户中设置一个新的“段”并将凭据设置为本地变量来解决问题:

 export GOOGLE_APPLICATION_CREDENTIALS='/home/marte/Documentos/Python_Scripts/Speech/sequoia-31501208ed06.json'

This link was useful: https://cloud.google.com/docs/authentication/getting-started此链接很有用: https://cloud.google.com/docs/authentication/getting-started

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM