AWS Lambda connection to SQS timed out

I am working on a task that involves a Lambda function running inside a VPC.

This function is supposed to push messages to SQS, and the Lambda execution role has the AWSLambdaSQSQueueExecutionRole and AWSLambdaVPCAccessExecutionRole policies attached.

Lambda function:

import boto3

# Create SQS client
sqs = boto3.client('sqs')

queue_url = 'https://sqs.ap-east-1a.amazonaws.com/073x08xx43xx37/xyz-queue'

# Send message to SQS queue
response = sqs.send_message(
    QueueUrl=queue_url,
    DelaySeconds=10,
    MessageAttributes={
        'Title': {
            'DataType': 'String',
            'StringValue': 'Tes1'
        },
        'Author': {
            'DataType': 'String',
            'StringValue': 'Test2'
        },
        'WeeksOn': {
            'DataType': 'Number',
            'StringValue': '1'
        }
    },
    MessageBody='Testing'
)

print(response['MessageId'])

On testing, the execution result is:

{
  "errorMessage": "2020-07-24T12:12:15.924Z f8e794fc-59ba-43bd-8fee-57f417fa50c9 Task timed out after 3.00 seconds"
}

I increased the Timeout under Basic Settings to 5 seconds and then to 10 seconds, but the error kept occurring.

If anyone has faced a similar issue in the past or has an idea how to resolve this, please help me out.

Thank you in advance.

When an AWS Lambda function is configured to use an Amazon VPC, it connects to a nominated subnet of the VPC. This allows the Lambda function to communicate with other resources inside the VPC. However, by default it cannot communicate with the Internet. This is a problem because the Amazon SQS public endpoint lives on the Internet, so the function times out while trying to reach it.
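To confirm that the timeout really is a connectivity problem and not a permissions problem, a small TCP probe can be run from inside the function. This is only a diagnostic sketch: the helper name can_reach is hypothetical, and it uses nothing but the standard library.

```python
import socket


def can_reach(host: str, port: int = 443, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection timeouts, refused connections, and DNS failures.
        return False


# Example call from inside a handler (the host is the regional SQS endpoint,
# here the us-west-2 one that appears later in this thread):
# print(can_reach('sqs.us-west-2.amazonaws.com'))
```

If the probe returns False from inside the VPC but True when the function is detached from the VPC, the problem is routing, and one of the options below should apply.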

Thus, you have 3 options:

Option 1: Do not connect to a VPC

If your Lambda function does not need to communicate with a resource in the VPC (such as the simple function you have provided above), simply do not connect it to the VPC. When a Lambda function is not connected to a VPC, it can communicate with the Internet and the Amazon SQS public endpoint.

Option 2: Use a VPC Endpoint

A VPC Endpoint provides a means of accessing an AWS service without going via the Internet. You would configure a VPC endpoint for Amazon SQS. Then, when the Lambda function wishes to connect with the SQS queue, it can access SQS via the endpoint rather than via the Internet. This is normally a good option if the Lambda function needs to communicate with other resources in the VPC.
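As a rough sketch of what creating such an endpoint could look like with boto3: the two function names and all the IDs are placeholders I made up, while the keyword arguments are the ones accepted by the EC2 create_vpc_endpoint call. The sketch assumes an interface endpoint with private DNS enabled, which in turn assumes DNS support and DNS hostnames are turned on for the VPC.

```python
def sqs_endpoint_params(region, vpc_id, subnet_ids, security_group_ids):
    """Build keyword arguments for EC2 create_vpc_endpoint (interface endpoint for SQS)."""
    return {
        "VpcEndpointType": "Interface",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.sqs",
        "SubnetIds": list(subnet_ids),
        "SecurityGroupIds": list(security_group_ids),
        # Lets the regular regional SQS hostname resolve to the endpoint's private IPs.
        "PrivateDnsEnabled": True,
    }


def create_sqs_endpoint(region, vpc_id, subnet_ids, security_group_ids):
    """Create the endpoint; requires boto3 and EC2 permissions."""
    import boto3  # imported here so the builder above can be used without boto3

    ec2 = boto3.client("ec2", region_name=region)
    params = sqs_endpoint_params(region, vpc_id, subnet_ids, security_group_ids)
    return ec2.create_vpc_endpoint(**params)


# Placeholder IDs, for illustration only:
# create_sqs_endpoint("us-west-2", "vpc-0123", ["subnet-0abc"], ["sg-0def"])
```

The security group attached to the endpoint has to allow inbound HTTPS (port 443) from the Lambda function's security group, otherwise the connection will still time out.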

Option 3: Use a NAT Gateway

If the Lambda function is configured to use a private subnet, it will be able to access the Internet if a NAT Gateway has been provisioned in a public subnet and the Route Table for the private subnet points to the NAT Gateway. This involves extra expense and is only worthwhile if there is an additional need for a NAT Gateway.
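To check whether a private subnet's route table actually points at a NAT Gateway, something like the following could be run against one entry of the RouteTables list returned by the EC2 describe_route_tables call. The helper names are hypothetical, and the check relies on NAT Gateway IDs starting with "nat-" and Internet Gateway IDs starting with "igw-".

```python
def default_route_target(route_table):
    """Return the target ID of the 0.0.0.0/0 route in a route-table description, or None."""
    for route in route_table.get("Routes", []):
        if route.get("DestinationCidrBlock") == "0.0.0.0/0":
            return route.get("NatGatewayId") or route.get("GatewayId")
    return None


def routes_via_nat(route_table):
    """True if the default route of this route table goes to a NAT Gateway."""
    target = default_route_target(route_table)
    return bool(target) and target.startswith("nat-")


# Illustrative data shaped like a describe_route_tables entry (IDs are made up):
private_table = {
    "Routes": [
        {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local"},
        {"DestinationCidrBlock": "0.0.0.0/0", "NatGatewayId": "nat-0123456789abcdef0"},
    ]
}
print(routes_via_nat(private_table))
```

A table whose default route targets an "igw-" ID belongs to a public subnet, which is not where the Lambda function should be placed.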

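You need to place your Lambda inside your VPC and then set up either a VPC endpoint for SQS or a NAT gateway. When you add your Lambda function to subnets, make sure you add it ONLY to private subnets. A Lambda function in a public subnet does not get a public IP address, so it cannot reach the Internet through the Internet gateway.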

Reference

https://docs.aws.amazon.com/lambda/latest/dg/vpc.html

https://aws.amazon.com/premiumsupport/knowledge-center/internet-access-lambda-function/

I am pretty convinced that you cannot call an SQS queue from a Lambda function inside a VPC through an SQS endpoint. I'd consider it a bug, but maybe the Lambda team did this for a reason. In any case, you will get a timeout. I cooked up a simple test Lambda:

import json
import boto3
import socket

def lambda_handler(event, context):
    print('lambda-test SQS...')
    sqsDomain='sqs.us-west-2.amazonaws.com'
    
    addr1 = socket.gethostbyname(sqsDomain)
    print('%s=%s' %(sqsDomain, addr1))
     
    print('Creating sqs client...')
    sqs = boto3.client('sqs')
    
    print('Sending Test Message...')
    response = sqs.send_message(
            QueueUrl='https://sqs.us-west-2.amazonaws.com/1234567890/testq.fifo',
            MessageBody='Test SQS Lambda!',
            MessageGroupId='test')
            
    print('SQS send response: %s' % response)

    return {
        'statusCode': 200,
        'body': json.dumps(response)
    }

I created a VPC, subnet, etc. per "Configuring a Lambda function to access resources in a VPC". The EC2 instance in this example has no problem invoking SQS through the private endpoint from the CLI per this tutorial.

If I drop my simple Lambda above into the same VPC and subnet, with SQS publishing permissions etc., and invoke the test function, it properly resolves the IP address of the SQS endpoint within the subnet, but the call times out (make sure your Lambda timeout is more than 60 seconds to let boto fail). Enabling boto debug logging further confirms that the IP is resolved correctly and that the HTTP request to SQS times out.

I didn't try this with a non-FIFO queue, but as the HTTP call is failing on the connection request, this shouldn't matter. It's got to be a routing issue from the Lambda, as the EC2 instance in the same subnet works.

I modified my simple Lambda, added an SNS endpoint, and ran the same test, which worked. The issue appears to be specific to SQS, as best I can tell.

import json
import boto3
import socket

def testSqs():
    print('lambda-test SQS...')
    sqsDomain='sqs.us-west-2.amazonaws.com'
    
    addr1 = socket.gethostbyname(sqsDomain)
    print('%s=%s' %(sqsDomain, addr1))
    
    print('Creating sqs client...')
    sqs = boto3.client('sqs')
    
    print('Sending Test Message...')
    response = sqs.send_message(
            QueueUrl='https://sqs.us-west-2.amazonaws.com/1234567890/testq.fifo',
            MessageBody='Test SQS Lambda!',
            MessageGroupId='test')
            
    print('SQS send response: %s' % response)

    return {
        'statusCode': 200,
        'body': json.dumps(response)
    }
    

def testSns():
    print('lambda-test SNS...')

    print('Creating sns client...')
    sns = boto3.client('sns')
    
    print('Sending Test Message...')
    response = sns.publish(
            TopicArn='arn:aws:sns:us-west-2:1234567890:lambda-test',
            Message='Test SQS Lambda!'
            )
            
    print('SNS send response: %s' % response)

    return {
        'statusCode': 200,
        'body': json.dumps(response)
    }
    

def lambda_handler(event, context):
    #return testSqs()
    return testSns()

I think your only options are NAT (per John above), bouncing your calls off a local EC2 instance (NAT will be simpler, cheaper, and more reliable), or using a Lambda proxy outside the VPC, which someone else suggested in a similar post. You could also subscribe an SQS queue to an SNS topic (I prototyped this and it works) and route it out that way too, but that just seems silly unless you absolutely have to have SQS for some obscure reason.

I switched to SNS. I was just hoping to get some more experience with SQS. Hopefully somebody can prove me wrong, but I call it a bug.

If you're using the boto3 Python library in a Lambda in a VPC and it fails to connect to an SQS queue through a VPC endpoint, you must set endpoint_url when creating the SQS client. Issue 1900 describes the background behind this.

The solution looks like this (for an SQS VPC endpoint in us-east-1):

sqs_client = boto3.client('sqs',
    endpoint_url='https://sqs.us-east-1.amazonaws.com')

Then call send_message or send_message_batch as normal.
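To avoid hard-coding the region, the endpoint URL can be derived from the AWS_REGION environment variable that the Lambda runtime sets. The sketch below is one way to do that; the helper names sqs_endpoint_url and make_sqs_client are my own, the URL pattern is assumed to be the standard one for commercial regions, and the short timeouts are just an assumption so that a routing problem fails quickly instead of using up the whole Lambda timeout.

```python
import os


def sqs_endpoint_url(region=None):
    """Return the regional SQS endpoint URL, falling back to AWS_REGION."""
    region = region or os.environ.get("AWS_REGION")
    if not region:
        raise ValueError("region not given and AWS_REGION is not set")
    return f"https://sqs.{region}.amazonaws.com"


def make_sqs_client(region=None, timeout_seconds=5):
    """Create an SQS client pinned to the regional endpoint, with short timeouts."""
    import boto3  # imported here so sqs_endpoint_url can be used without boto3
    from botocore.config import Config

    region = region or os.environ.get("AWS_REGION")
    config = Config(
        connect_timeout=timeout_seconds,
        read_timeout=timeout_seconds,
        retries={"max_attempts": 2},
    )
    return boto3.client(
        "sqs",
        region_name=region,
        endpoint_url=sqs_endpoint_url(region),
        config=config,
    )


# sqs_client = make_sqs_client()  # inside Lambda, AWS_REGION is already set
```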
