
AWS Transcribe StartTranscriptionJob API error

I'm trying to access an MP3 file stored in an S3 bucket I own that has Block Public Access enabled. Uploading the MP3 to my source S3 bucket triggers a Lambda function that should start the Transcribe job. I have two issues:

  1. I do not know if the S3 object URL I use for MediaFileUri is correct. I've seen conflicting information.
  2. I don't know if my bucket being private is an issue.

Two CloudWatch error messages:

"An error occurred (BadRequestException) when calling the StartTranscriptionJob operation: 1 validation error detected: Value 'source/2004-DNC.mp3' at 'transcriptionJobName' failed to satisfy constraint: Member must satisfy regular expression pattern: ^[0-9a-zA-Z._-]+"

"An error occurred (BadRequestException) when calling the StartTranscriptionJob operation: The S3 URI that you provided can't be accessed. Make sure that you have read permission and try your request again."

Lambda Function

import boto3

s3 = boto3.client('s3')
transcribe = boto3.client('transcribe')

def lambda_handler(event, context):
    for record in event['Records']:
        source_bucket = record['s3']['bucket']['name']
        key = record['s3']['object']['key']
        object_url = "s3://{0}/{1}".format(source_bucket, key)
    
        response = transcribe.start_transcription_job(
            TranscriptionJobName=key,
            Media={'MediaFileUri': object_url},
            MediaFormat='mp3',
            LanguageCode='en-US',
        )
        print(response)

IAM Policy

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:PutLogEvents"
            ],
            "Resource": "arn:aws:logs:*:*:*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject"
            ],
            "Resource": "arn:aws:s3:::abcdefghijk-transcribe-source/*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "transcribe:StartTranscriptionJob"
            ],
            "Resource": "*"
        }
    ]
}

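The answer to this error had two parts:

(1) The problem was the structure of the object_url variable, not the permissions (i.e. the IAM policy). Previous documentation listed the path-style format, which looks like this: https://s3.us-west-2.amazonaws.com/BUCKET-NAME/OBJECT-KEY (this style has been deprecated; the original answer linked to the documentation for it).

Using the virtual-hosted-style format, https://BUCKET-NAME.s3.amazonaws.com/OBJECT-KEY, resolved my issue.

(2) TranscriptionJobName must be a string that matches the pattern shown in the first error message, ^[0-9a-zA-Z._-]+. The object key 'source/2004-DNC.mp3' contains a '/', which that pattern does not allow, so the raw key cannot be passed as the job name.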
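
A minimal sketch of how the two fixes might look in the Lambda function, not a definitive implementation. The helper names job_name_from_key and media_uri are hypothetical, and replacing disallowed characters with '_' is just one assumed way to satisfy the pattern. The sketch also assumes the object key contains no characters that would need URL decoding or encoding.

```python
import re


def job_name_from_key(key):
    """Hypothetical helper: make a job name that matches ^[0-9a-zA-Z._-]+
    by replacing every other character (such as '/') with '_'."""
    return re.sub(r"[^0-9a-zA-Z._-]", "_", key)


def media_uri(bucket, key):
    """Hypothetical helper: build the virtual-hosted-style URL that the
    answer reports as working."""
    return "https://{0}.s3.amazonaws.com/{1}".format(bucket, key)


def lambda_handler(event, context):
    # Imported here so the helpers above can be exercised without the AWS SDK.
    import boto3

    transcribe = boto3.client("transcribe")
    for record in event["Records"]:
        source_bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]

        response = transcribe.start_transcription_job(
            TranscriptionJobName=job_name_from_key(key),
            Media={"MediaFileUri": media_uri(source_bucket, key)},
            MediaFormat="mp3",
            LanguageCode="en-US",
        )
        print(response)
```

With the key from the error message, job_name_from_key('source/2004-DNC.mp3') gives 'source_2004-DNC.mp3', which no longer contains the '/' that triggered the validation error.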
