RecognitionConfig error with .flac files in Google transcribe

I am trying to transcribe an audio file using Google Cloud. Here is my code:

from google.cloud.speech_v1 import enums
from google.cloud import speech_v1p1beta1
import os
import io


def sample_long_running_recognize(local_file_path):

    client = speech_v1p1beta1.SpeechClient()

    # local_file_path = 'resources/commercial_mono.wav'

    # If enabled, each word in the first alternative of each result will be
    # tagged with a speaker tag to identify the speaker.
    enable_speaker_diarization = True

    # Optional. Specifies the estimated number of speakers in the conversation.
    diarization_speaker_count = 2

    # The language of the supplied audio
    language_code = "en-US"
    config = {
        "enable_speaker_diarization": enable_speaker_diarization,
        "diarization_speaker_count": diarization_speaker_count,
        "language_code": language_code,
        "encoding": enums.RecognitionConfig.AudioEncoding.FLAC
    }
    with io.open(local_file_path, "rb") as f:
        content = f.read()
    audio = {"content": content}
    # audio = {"uri": storage_uri}


    operation = client.long_running_recognize(config, audio)

    print(u"Waiting for operation to complete...")
    response = operation.result()

    for result in response.results:
        # First alternative has words tagged with speakers
        alternative = result.alternatives[0]
        print(u"Transcript: {}".format(alternative.transcript))
        # Print the speaker_tag of each word
        for word in alternative.words:
            print(u"Word: {}".format(word.word))
            print(u"Speaker tag: {}".format(word.speaker_tag))


sample_long_running_recognize('/Users/asi/Downloads/trimmed_3.flac')

I keep getting this error:

google.api_core.exceptions.InvalidArgument: 400 audio_channel_count `1` in RecognitionConfig must either be unspecified or match the value in the FLAC header `2`.

I can't figure out what I'm doing wrong. I pretty much copied and pasted this from the Google Cloud Speech API docs. Any suggestions?

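This attribute (audio_channel_count) is the number of channels in the input audio data, and it only needs to be set for MULTI-CHANNEL recognition. I would assume that this is your case, so, as the message suggests, you need to set 'audio_channel_count': 2 in your config to exactly match your audio file.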
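To avoid hard-coding the value, the channel count can be read from the FLAC header itself before building the config. The following is a minimal sketch, not part of the Speech library: `flac_channel_count` and `diarization_config` are hypothetical helper names, and the parser assumes a standard FLAC file whose first metadata block is STREAMINFO. The returned dict leaves out "encoding"; add that entry as in the question's code.

```python
def flac_channel_count(path):
    """Return the channel count stored in a FLAC file's STREAMINFO block.

    Hypothetical helper: reads only the first 42 bytes of the file
    ("fLaC" marker, 4-byte metadata block header, 34-byte STREAMINFO).
    """
    with open(path, "rb") as f:
        header = f.read(42)
    if len(header) < 42 or header[:4] != b"fLaC":
        raise ValueError("not a FLAC file: {}".format(path))
    # The low 7 bits of the block header's first byte are the block type;
    # type 0 is STREAMINFO.
    if header[4] & 0x7F != 0:
        raise ValueError("first metadata block is not STREAMINFO")
    streaminfo = header[8:]
    # Bytes 10-12 of STREAMINFO hold the 20-bit sample rate, followed by
    # 3 bits of (channels - 1) and the first bit of (bits per sample - 1).
    return ((streaminfo[12] >> 1) & 0x07) + 1


def diarization_config(path, language_code="en-US", speaker_count=2):
    """Build the config fields from the question, with audio_channel_count
    taken from the file so it always matches the FLAC header."""
    return {
        "enable_speaker_diarization": True,
        "diarization_speaker_count": speaker_count,
        "language_code": language_code,
        "audio_channel_count": flac_channel_count(path),
    }
```

For the stereo file in the question, `diarization_config(local_file_path)` would then carry 'audio_channel_count': 2, which is the value the error message asks for.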

Please take a look at the source code for more information about the attributes of the RecognitionConfig object.

