
How to read mp3 data from Google Cloud using Python

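I am trying to read mp3/wav data from Google Cloud in order to implement an audio classification technique. The problem is that I cannot read the result that the Google API returns in the variable response.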

Below is my Python code:

speech_file = r'gs://pp003231/a4a.wav'
config = speech.types.RecognitionConfig(
    encoding=speech.enums.RecognitionConfig.AudioEncoding.LINEAR16,
    language_code='en-US',
    enable_speaker_diarization=True,
    diarization_speaker_count=2)
audio = speech.types.RecognitionAudio(uri=speech_file)
response = client.long_running_recognize(config, audio)
print response
result = response.results[-1]
print result

The output shown on the console is:

Traceback (most recent call last):
  File "a1.py", line 131
    print response.results
AttributeError: 'Operation' object has no attribute 'results'

Could you share your expert advice on what I am doing wrong? Thanks for your help.

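It is too late for the author of this thread, but I am posting the solution for future readers because I had a similar problem. long_running_recognize returns an Operation object, not the response itself; calling result() on it waits for the operation to finish and returns the response. So change result = response.results[-1] to result = response.result().results[-1] and it will work.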
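As a minimal sketch of that fix, the helper below wraps the wait and the lookup in one place. The name last_result and the 300-second timeout are my own choices for illustration, not part of the library.

```python
def last_result(operation, timeout=300):
    """Wait for a long-running recognize operation and return its last result.

    `operation` is whatever client.long_running_recognize(...) returned.
    The helper name and the default timeout are illustrative assumptions.
    """
    # result() blocks until the operation finishes (or the timeout expires)
    # and returns the actual response object.
    response = operation.result(timeout=timeout)
    if not response.results:
        raise ValueError("the operation finished but returned no results")
    return response.results[-1]
```

With the question's code this would be called as result = last_result(client.long_running_recognize(config, audio)).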

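Do you have access to the wav file in the bucket? Also, is this the complete code? The sample_rate_hertz setting and the imports seem to be missing. Here is code you can copy and paste; it comes from the Google docs samples, but I edited it so that it only covers the diarization feature.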
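Since sample_rate_hertz has to match the audio, one way to find the right value is to read it from the WAV header with Python's standard wave module. This is only a sketch: wav_sample_rate is a hypothetical helper name, and it assumes you have a local copy of the file downloaded from the bucket.

```python
import wave


def wav_sample_rate(path_or_file):
    """Return (sample_rate_hertz, channels, sample_width_bytes) of a WAV file.

    Accepts a file name or an open binary file object, as wave.open does.
    """
    with wave.open(path_or_file, "rb") as wav:
        return wav.getframerate(), wav.getnchannels(), wav.getsampwidth()
```

A sample width of 2 bytes corresponds to the 16-bit samples that the LINEAR16 encoding expects; the first value is what goes into sample_rate_hertz.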

#!/usr/bin/env python
"""Google Cloud Speech API sample that demonstrates speaker diarization.
Example usage:
    python diarization.py
"""

import argparse
import io



def transcribe_file_with_diarization():
    """Transcribe the given audio file synchronously with diarization."""
    # [START speech_transcribe_diarization_beta]
    from google.cloud import speech_v1p1beta1 as speech
    client = speech.SpeechClient()



    # Replace the placeholders with your own bucket and file names.
    audio = speech.types.RecognitionAudio(uri="gs://<YOUR_BUCKET>/<YOUR_WAV_FILE>")

    config = speech.types.RecognitionConfig(
        encoding=speech.enums.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=8000,
        language_code='en-US',
        enable_speaker_diarization=True,
        diarization_speaker_count=2)

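    # recognize() is synchronous and returns the response directly.
    # For audio longer than about one minute, use
    # client.long_running_recognize(config, audio).result() instead.
    print('Waiting for recognition to complete...')
    response = client.recognize(config, audio)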

    # The transcript within each result is separate and sequential per result.
    # However, the words list within an alternative includes all the words
    # from all the results thus far. Thus, to get all the words with speaker
    # tags, you only have to take the words list from the last result:
    result = response.results[-1]

    words_info = result.alternatives[0].words

    # Printing out the output:
    for word_info in words_info:
        print("word: '{}', speaker_tag: {}".format(word_info.word,
                                                   word_info.speaker_tag))
    # [END speech_transcribe_diarization_beta]



if __name__ == '__main__':

    transcribe_file_with_diarization()
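If the per-word output is too noisy, the words can be grouped into speaker turns. The sketch below assumes only that each item has the word and speaker_tag attributes used in the sample; words_by_speaker_turn is a hypothetical helper name.

```python
from itertools import groupby


def words_by_speaker_turn(words_info):
    """Group consecutive words with the same speaker_tag into (tag, text) turns."""
    turns = []
    # groupby only merges neighbours, so a speaker who talks twice gets two turns.
    for tag, group in groupby(words_info, key=lambda w: w.speaker_tag):
        turns.append((tag, " ".join(w.word for w in group)))
    return turns
```

It would be called with the words_info list from the sample, for example for tag, text in words_by_speaker_turn(words_info): print(tag, text).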

To run the diarization script, save it as diarization.py and use the following command:

python diarization.py

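In addition, you must install the latest google-cloud-speech library. (Note that the speech.types and speech.enums helpers used above belong to the 1.x releases of the library; they were removed in 2.0.)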

pip install --upgrade google-cloud-speech

You also need your service account credentials in a JSON file; you can find more information here.
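One common way to hand that JSON file to the client library is the GOOGLE_APPLICATION_CREDENTIALS environment variable. The sketch below sets it from Python after a basic sanity check; use_service_account_key is a hypothetical helper name, and it has to run before speech.SpeechClient() is created.

```python
import json
import os


def use_service_account_key(key_path):
    """Point Google client libraries at a service account key file.

    Checks that the file is JSON and looks like a service account key,
    then sets GOOGLE_APPLICATION_CREDENTIALS for the current process.
    Returns the service account's e-mail address from the key file.
    """
    with open(key_path) as f:
        info = json.load(f)
    if info.get("type") != "service_account":
        raise ValueError("not a service account key file: %s" % key_path)
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = os.path.abspath(key_path)
    return info.get("client_email")
```

Setting the variable in the shell before running python diarization.py has the same effect.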
