Lambda automatically deletes transcribe job upon completion
I'm looking to edit my Lambda so that it deletes the transcription job once its job status is "COMPLETED". I have the following code:
import json
import time
import boto3
from urllib.request import urlopen

def lambda_handler(event, context):
    transcribe = boto3.client("transcribe")
    s3 = boto3.client("s3")
    if event:
        file_obj = event["Records"][0]
        bucket_name = str(file_obj["s3"]["bucket"]["name"])
        file_name = str(file_obj["s3"]["object"]["key"])
        s3_uri = create_uri(bucket_name, file_name)
        file_type = file_name.split("2019.")[1]
        job_name = file_name
        transcribe.start_transcription_job(
            TranscriptionJobName=job_name,
            Media={"MediaFileUri": s3_uri},
            MediaFormat=file_type,
            LanguageCode="en-US",
            Settings={
                "VocabularyName": "Custom_Vocabulary_by_Brand_Other_Brands",
                "ShowSpeakerLabels": True,
                "MaxSpeakerLabels": 4
            })
        while True:
            status = transcribe.get_transcription_job(TranscriptionJobName=job_name)
            if status["TranscriptionJob"]["TranscriptionJobStatus"] in ["FAILED"]:
                break
            print("It's in progress")
        while True:
            status = transcribe.get_transcription_job(TranscriptionJobName=job_name)
            if status["TranscriptionJob"]["TranscriptionJobStatus"] in ["COMPLETED"]:
                transcribe.delete_transcription_job(TranscriptionJobName=job_name)
            time.sleep(5)
        load_url = urlopen(status["TranscriptionJob"]["Transcript"]["TranscriptFileUri"])
        load_json = json.dumps(json.load(load_url))
        s3.put_object(Bucket=bucket_name, Key="transcribeFile/{}.json".format(job_name), Body=load_json)
    # TODO implement
    return {
        'statusCode': 200,
        'body': json.dumps('Hello from Lambda!')
    }

def create_uri(bucket_name, file_name):
    return "s3://" + bucket_name + "/" + file_name
The part that handles the job is:
while True:
    status = transcribe.get_transcription_job(TranscriptionJobName=job_name)
    if status["TranscriptionJob"]["TranscriptionJobStatus"] in ["FAILED"]:
        break
    print("It's in progress")
while True:
    status = transcribe.get_transcription_job(TranscriptionJobName=job_name)
    if status["TranscriptionJob"]["TranscriptionJobStatus"] in ["COMPLETED"]:
        transcribe.delete_transcription_job(TranscriptionJobName=job_name)
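So if the job is in progress, it prints "It's in progress", and when the status reads "COMPLETED", it is supposed to delete the job.
Any ideas why my current code isn't working? It completes the transcription job but doesn't delete it.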
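You should not poll for information if you can avoid it, especially in Lambda.
The correct way to respond to transcription job status changes is to use CloudWatch Events. For example, you can configure a rule that routes an event to an AWS Lambda function when a transcription job completes successfully.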
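As a rough illustration of such a rule, the sketch below builds an event pattern that matches completed transcription jobs and registers it with the boto3 `events` client. The helper names `build_transcribe_rule_pattern` and `register_rule`, the target `Id`, and the rule name and function ARN you pass in are all hypothetical and not from the original post. The `source` and `detail-type` values are the ones that appear in the Transcribe state-change event.

```python
import json


def build_transcribe_rule_pattern(statuses=("COMPLETED",)):
    """Return an event pattern matching Transcribe job state changes.

    Only events whose detail.TranscriptionJobStatus is in `statuses` match.
    """
    return {
        "source": ["aws.transcribe"],
        "detail-type": ["Transcribe Job State Change"],
        "detail": {"TranscriptionJobStatus": list(statuses)},
    }


def register_rule(events_client, rule_name, lambda_arn):
    """Create (or update) the rule and point it at a Lambda function.

    `events_client` is assumed to be boto3.client("events"); it is passed in
    so the function can be exercised without AWS credentials.
    """
    events_client.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(build_transcribe_rule_pattern()),
        State="ENABLED",
    )
    events_client.put_targets(
        Rule=rule_name,
        # "delete-transcribe-job" is an arbitrary, hypothetical target id.
        Targets=[{"Id": "delete-transcribe-job", "Arn": lambda_arn}],
    )
```

The target function also needs a resource-based permission that allows `events.amazonaws.com` to invoke it; that step is left out here.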
When your Lambda function is invoked because of a state change in a transcription job, the function receives event data such as:
{
  "version": "0",
  "id": "1a234567-1a6d-3ab4-1234-abf8b19be1234",
  "detail-type": "Transcribe Job State Change",
  "source": "aws.transcribe",
  "account": "123456789012",
  "time": "2019-11-19T10:00:05Z",
  "region": "us-east-1",
  "resources": [],
  "detail": {
    "TranscriptionJobName": "my-transcribe-test",
    "TranscriptionJobStatus": "COMPLETED"
  }
}
Use TranscriptionJobName to correlate the state change back to the original job.
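A minimal sketch of a handler for that event might look like the following. It reads `TranscriptionJobName` and `TranscriptionJobStatus` from `detail` and deletes the job only when the status is `COMPLETED`. The helper name `handle_job_state_change` and the shape of its return value are assumptions made for illustration; `delete_transcription_job` is the same boto3 call used in the question.

```python
def handle_job_state_change(event, transcribe_client):
    """Delete the transcription job named in a state-change event.

    Returns a small summary dict (a hypothetical shape, for logging/testing).
    """
    detail = event.get("detail", {})
    job_name = detail.get("TranscriptionJobName")
    status = detail.get("TranscriptionJobStatus")

    # Ignore events without a job name or with a non-COMPLETED status.
    if not job_name or status != "COMPLETED":
        return {"deleted": False, "job": job_name, "status": status}

    transcribe_client.delete_transcription_job(TranscriptionJobName=job_name)
    return {"deleted": True, "job": job_name, "status": status}


def lambda_handler(event, context):
    # boto3 is imported here so the helper above can be tested without AWS.
    import boto3

    return handle_job_state_change(event, boto3.client("transcribe"))
```

If the transcript is still needed, it is safer to copy it to your own bucket before the delete call, as the code in the question does with `s3.put_object`.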
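Sorry guys, I took another look and realized I had made a very, very silly mistake. I had transcribe.delete_transcription_job(TranscriptionJobName=job_name) in completely the wrong section.
Please find the correct, working code below: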
import json
import time
import boto3
from urllib.request import urlopen

def lambda_handler(event, context):
    transcribe = boto3.client("transcribe")
    s3 = boto3.client("s3")
    if event:
        # Build the job parameters from the S3 upload event.
        file_obj = event["Records"][0]
        bucket_name = str(file_obj["s3"]["bucket"]["name"])
        file_name = str(file_obj["s3"]["object"]["key"])
        s3_uri = create_uri(bucket_name, file_name)
        file_type = file_name.split("2019.")[1]
        job_name = file_name
        transcribe.start_transcription_job(
            TranscriptionJobName=job_name,
            Media={"MediaFileUri": s3_uri},
            MediaFormat=file_type,
            LanguageCode="en-US",
            Settings={
                "VocabularyName": "Custom_Vocabulary_by_Brand_Other_Brands",
                "ShowSpeakerLabels": True,
                "MaxSpeakerLabels": 4
            })
        # Poll until the job reaches a terminal state, then delete it.
        while True:
            status = transcribe.get_transcription_job(TranscriptionJobName=job_name)
            if status["TranscriptionJob"]["TranscriptionJobStatus"] in ["COMPLETED", "FAILED"]:
                transcribe.delete_transcription_job(TranscriptionJobName=job_name)
                break
            print("It's in progress")
            time.sleep(5)
        # Download the transcript and store it in the source bucket.
        load_url = urlopen(status["TranscriptionJob"]["Transcript"]["TranscriptFileUri"])
        load_json = json.dumps(json.load(load_url))
        s3.put_object(Bucket=bucket_name, Key="transcribeFile/{}.json".format(job_name), Body=load_json)
    # TODO implement
    return {
        'statusCode': 200,
        'body': json.dumps('Hello from Lambda!')
    }

def create_uri(bucket_name, file_name):
    return "s3://" + bucket_name + "/" + file_name
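If you do keep polling inside the function, one possible refinement is to bound the wait and to read the transcript URI only when the job actually completed. The sketch below is an assumption-laden variant, not the code from the post: `wait_for_job`, `transcript_uri_or_none`, and the `poll_seconds`, `timeout_seconds`, and `sleep` parameters are hypothetical names, and the default timeout is arbitrary.

```python
import time


def wait_for_job(transcribe_client, job_name, poll_seconds=5,
                 timeout_seconds=600, sleep=time.sleep):
    """Poll get_transcription_job until the job is COMPLETED or FAILED.

    Returns the "TranscriptionJob" dict of the final response and raises
    TimeoutError once roughly `timeout_seconds` have been spent sleeping.
    """
    waited = 0
    while True:
        response = transcribe_client.get_transcription_job(
            TranscriptionJobName=job_name)
        job = response["TranscriptionJob"]
        if job["TranscriptionJobStatus"] in ("COMPLETED", "FAILED"):
            return job
        if waited >= timeout_seconds:
            raise TimeoutError("job {} still running".format(job_name))
        sleep(poll_seconds)
        waited += poll_seconds


def transcript_uri_or_none(job):
    """Return the transcript URI for a COMPLETED job, otherwise None."""
    if job["TranscriptionJobStatus"] != "COMPLETED":
        return None
    return job.get("Transcript", {}).get("TranscriptFileUri")
```

In the handler above, the loop could then be replaced by `job = wait_for_job(transcribe, job_name)`, followed by the download and the delete call only when `transcript_uri_or_none(job)` returns a URI. The timeout has to stay below the function's configured Lambda timeout.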