Get subtitles in an AWS Transcribe job

I am writing a function that fetches the transcription output from an AWS Transcribe job.

import json
import time
import urllib.request

import boto3


def get_text(job_name, file_uri):
    # file_uri is accepted but not used here; the job is looked up by name
    transcribe_client = boto3.client('transcribe')
    text = None  # stays None if the job fails or never completes
    max_tries = 60
    while max_tries > 0:
        max_tries -= 1
        job = transcribe_client.get_transcription_job(TranscriptionJobName=job_name)
        job_status = job['TranscriptionJob']['TranscriptionJobStatus']
        if job_status in ['COMPLETED', 'FAILED']:
            print(f"Job {job_name} is {job_status}.")
            if job_status == 'COMPLETED':
                # The transcript file is JSON, so it can be parsed directly
                response = urllib.request.urlopen(job['TranscriptionJob']['Transcript']['TranscriptFileUri'])
                data = json.loads(response.read())
                print(data)
                text = data['results']['transcripts'][0]['transcript']
            break
        else:
            print(f"Waiting for {job_name}. Current status is {job_status}.")
        time.sleep(10)  # poll every 10 seconds
    return text

This gives me the output I expect. But when I change job['TranscriptionJob']['Transcript']['TranscriptFileUri'] to job['TranscriptionJob']['Subtitles']['SubtitleFileUris'], I get an error.

What should I do in this case?

job['TranscriptionJob']['Subtitles']['SubtitleFileUris'] is a list of URIs, not a single URI, so it cannot be passed to urlopen directly. You need to loop over it, something like this:

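if job_status == 'COMPLETED':
    subtitles = []
    # The list lives under 'Subtitles', not 'Transcript'
    for uri in job['TranscriptionJob']['Subtitles']['SubtitleFileUris']:
        response = urllib.request.urlopen(uri)
        # Subtitle files are SRT or WebVTT text, not JSON, so decode instead of json.loads
        subtitles.append(response.read().decode('utf-8'))
    print(subtitles)
    text = '\n'.join(subtitles)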
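
For reference, here is a minimal standalone sketch of the same idea, assuming the job response has the shape returned by get_transcription_job. The names subtitle_uris, fetch_subtitles and the opener parameter are hypothetical, introduced only for illustration; opener defaults to urllib.request.urlopen and exists so the logic can be tried without network access.

```python
import urllib.request


def subtitle_uris(job):
    """Return the subtitle file URIs from a get_transcription_job response.

    Returns an empty list when the job has no subtitle output.
    """
    subtitles = job.get('TranscriptionJob', {}).get('Subtitles', {})
    return list(subtitles.get('SubtitleFileUris', []))


def fetch_subtitles(job, opener=urllib.request.urlopen):
    """Download every subtitle file and return a {uri: text} dict.

    `opener` is a hypothetical injection point (not part of boto3); it must
    return a file-like object usable as a context manager.
    """
    result = {}
    for uri in subtitle_uris(job):
        with opener(uri) as response:
            # Subtitle files are plain text (SRT or WebVTT), so decode them
            result[uri] = response.read().decode('utf-8')
    return result
```

Note that the 'Subtitles' entry is only populated when the job was started with the Subtitles parameter of start_transcription_job (for example Subtitles={'Formats': ['srt', 'vtt']}). This sketch also assumes the URIs can be read with a plain HTTP request; if they cannot, the files would need to be fetched another way.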

