AWS Lambda connection to SQS timed out

I am working on a task that involves a Lambda function running inside a VPC.

The function is supposed to push messages to SQS, and the Lambda execution role has two policies attached: AWSLambdaSQSQueueExecutionRole and AWSLambdaVPCAccessExecutionRole.

Lambda function:

import boto3

# Create SQS client
sqs = boto3.client('sqs')

queue_url = 'https://sqs.ap-east-1a.amazonaws.com/073x08xx43xx37/xyz-queue'

# Send message to SQS queue
response = sqs.send_message(
    QueueUrl=queue_url,
    DelaySeconds=10,
    MessageAttributes={
        'Title': {
            'DataType': 'String',
            'StringValue': 'Tes1'
        },
        'Author': {
            'DataType': 'String',
            'StringValue': 'Test2'
        },
        'WeeksOn': {
            'DataType': 'Number',
            'StringValue': '1'
        }
    },
    MessageBody=(
        'Testing'
     )
)

print(response['MessageId'])

On testing, the execution result is:

{
  "errorMessage": "2020-07-24T12:12:15.924Z f8e794fc-59ba-43bd-8fee-57f417fa50c9 Task timed out after 3.00 seconds"
}

I also increased the Timeout under Basic Settings to 5 seconds and then to 10 seconds, but the error kept occurring.

If anyone has faced a similar issue in the past or has an idea how to resolve this, please help me out.

Thank you in advance.

When an AWS Lambda function is configured to use an Amazon VPC, it connects to a nominated subnet of the VPC. This allows the Lambda function to communicate with other resources inside the VPC. However, it cannot communicate with the Internet. This is a problem because the Amazon SQS public endpoint lives on the Internet, so the function times out because it is unable to reach it.

Thus, you have 3 options:

Option 1: Do not connect to a VPC

If your Lambda function does not need to communicate with a resource in the VPC (such as the simple function you have provided above), simply do not connect it to the VPC. When a Lambda function is not connected to a VPC, it can communicate with the Internet and the Amazon SQS public endpoint.

Option 2: Use a VPC Endpoint

A VPC Endpoint provides a means of accessing an AWS service without going via the Internet. You would configure a VPC endpoint for Amazon SQS. Then, when the Lambda function wishes to connect to the SQS queue, it can access SQS via the endpoint rather than via the Internet. This is normally a good option if the Lambda function needs to communicate with other resources in the VPC.
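As a rough sketch of Option 2, an interface endpoint for SQS could be created with boto3 along these lines. The helper names are hypothetical and the VPC, subnet, and security group IDs are placeholders you would supply; the service name follows the usual `com.amazonaws.<region>.sqs` pattern. The security group attached to the endpoint also needs to allow inbound HTTPS (port 443) from the Lambda function.

```python
def sqs_endpoint_params(region, vpc_id, subnet_ids, security_group_ids):
    """Build the arguments for ec2.create_vpc_endpoint (interface endpoint for SQS)."""
    return {
        "VpcEndpointType": "Interface",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.sqs",
        "SubnetIds": list(subnet_ids),
        "SecurityGroupIds": list(security_group_ids),
        # Private DNS lets the regional SQS hostname resolve to the
        # endpoint's private IP addresses inside the VPC.
        "PrivateDnsEnabled": True,
    }


def create_sqs_endpoint(region, vpc_id, subnet_ids, security_group_ids):
    """Create the endpoint; requires boto3 and EC2 permissions."""
    import boto3  # imported here so the helper above works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    params = sqs_endpoint_params(region, vpc_id, subnet_ids, security_group_ids)
    return ec2.create_vpc_endpoint(**params)
```

Calling `create_sqs_endpoint("us-east-1", "vpc-...", ["subnet-..."], ["sg-..."])` with your own IDs would issue the request; the same settings can of course be applied from the console or from infrastructure templates instead.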

Option 3: Use a NAT Gateway

If the Lambda function is configured to use a private subnet, it will be able to access the Internet if a NAT Gateway has been provisioned in a public subnet and the Route Table for the private subnet points to the NAT Gateway. This involves extra expense and is only worthwhile if there is an additional need for a NAT Gateway.
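To check the routing part of Option 3, a small helper like the one below could inspect a route table and report whether its default route points at a NAT Gateway. This is only a sketch: the function name is hypothetical, and the dictionary shape is assumed to match one entry of the `RouteTables` list returned by the EC2 `describe_route_tables` call.

```python
def has_nat_default_route(route_table):
    """Return True if the table sends 0.0.0.0/0 to a NAT gateway."""
    for route in route_table.get("Routes", []):
        if route.get("DestinationCidrBlock") == "0.0.0.0/0":
            # A NAT route carries NatGatewayId; an Internet gateway route
            # carries GatewayId (igw-...) instead.
            return route.get("NatGatewayId", "").startswith("nat-")
    return False
```

A route table whose default route has a `GatewayId` starting with `igw-` belongs to a public subnet, so the helper returns False for it.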

You need to place your Lambda inside your VPC and then set up either a VPC endpoint for SQS or a NAT gateway. When you add your Lambda function to subnets, make sure you add it ONLY to private subnets; otherwise nothing will work.

Reference

https://docs.aws.amazon.com/lambda/latest/dg/vpc.html

https://aws.amazon.com/premiumsupport/knowledge-center/internet-access-lambda-function/

I am pretty convinced that you cannot call an SQS queue from a Lambda function inside a VPC through an SQS VPC endpoint. I'd consider it a bug, but maybe the Lambda team did this for a reason. In any case, you will get a timeout. I cooked up a simple test Lambda:

import json
import boto3
import socket

def lambda_handler(event, context):
    print('lambda-test SQS...')
    sqsDomain='sqs.us-west-2.amazonaws.com'
    
    addr1 = socket.gethostbyname(sqsDomain)
    print('%s=%s' %(sqsDomain, addr1))
     
    print('Creating sqs client...')
    sqs = boto3.client('sqs')
    
    print('Sending Test Message...')
    response = sqs.send_message(
            QueueUrl='https://sqs.us-west-2.amazonaws.com/1234567890/testq.fifo',
            MessageBody='Test SQS Lambda!',
            MessageGroupId='test')
            
    print('SQS send response: %s' % response)

    return {
        'statusCode': 200,
        'body': json.dumps(response)
    }

I created a VPC, subnet, etc. per "Configuring a Lambda function to access resources in a VPC". The EC2 instance in this example has no problem calling SQS through the private endpoint from the CLI, per this tutorial.

If I drop my simple Lambda above into the same VPC and subnet, with SQS publishing permissions etc., and invoke the test function, it properly resolves the IP address of the SQS endpoint within the subnet, but the call times out (make sure your Lambda timeout is more than 60 seconds to let boto fail). Enabling boto debug logging further confirms that the IP is resolved correctly and that the HTTP request to SQS times out.

I didn't try this with a non-FIFO queue, but as the HTTP call fails on the connection request, this shouldn't matter. It's got to be a routing issue from the Lambda, as the EC2 instance in the same subnet works.
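The DNS lookup in the test function only shows that the name resolves. To separate a connection failure from an SQS API problem, a plain TCP connection attempt with a short timeout can be made from inside the Lambda. The sketch below uses only the standard library; `can_connect` is a hypothetical helper name, and the port and timeout defaults are assumptions.

```python
import socket


def can_connect(host, port=443, timeout=3.0):
    """Try to open a TCP connection; return True on success, False otherwise."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and DNS failures.
        return False
```

Printing `can_connect('sqs.us-west-2.amazonaws.com')` at the top of the handler should fail within a few seconds if the function has no route to the endpoint, rather than waiting for boto or the Lambda timeout.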

I modified my simple Lambda, added an SNS endpoint, and ran the same test, which worked. As best I can tell, the issue appears to be specific to SQS.

import json
import boto3
import socket

def testSqs():
    print('lambda-test SQS...')
    sqsDomain='sqs.us-west-2.amazonaws.com'
    
    addr1 = socket.gethostbyname(sqsDomain)
    print('%s=%s' %(sqsDomain, addr1))
    
    print('Creating sqs client...')
    sqs = boto3.client('sqs')
    
    print('Sending Test Message...')
    response = sqs.send_message(
            QueueUrl='https://sqs.us-west-2.amazonaws.com/1234567890/testq.fifo',
            MessageBody='Test SQS Lambda!',
            MessageGroupId='test')
            
    print('SQS send response: %s' % response)

    return {
        'statusCode': 200,
        'body': json.dumps(response)
    }
    

def testSns():
    print('lambda-test SNS...')

    print('Creating sns client...')
    sns = boto3.client('sns')
    
    print('Sending Test Message...')
    response = sns.publish(
            TopicArn='arn:aws:sns:us-west-2:1234567890:lambda-test',
            Message='Test SQS Lambda!'
            )
            
    print('SNS send response: %s' % response)

    return {
        'statusCode': 200,
        'body': json.dumps(response)
    }
    

def lambda_handler(event, context):
    #return testSqs()
    return testSns()

I think your only options are NAT (per John above), bouncing your calls off a local EC2 instance (NAT will be simpler, cheaper, and more reliable), or using a Lambda proxy outside the VPC, which someone else suggested in a similar post. You could also subscribe an SQS queue to an SNS topic (I prototyped this and it works) and route it out that way too, but that just seems silly unless you absolutely have to have SQS for some obscure reason.

I switched to SNS. I was just hoping to get some more experience with SQS. Hopefully somebody can prove me wrong, but I call it a bug.

If you're using the boto3 Python library in a Lambda in a VPC, and it's failing to connect to an SQS queue through a VPC endpoint, you must set the endpoint_url when creating the SQS client. Issue 1900 describes the background behind this.

The solution looks like this (for an SQS VPC endpoint in us-east-1):

sqs_client = boto3.client('sqs',
    endpoint_url='https://sqs.us-east-1.amazonaws.com')

Then call send_message or send_message_batch as normal.
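To avoid hard-coding the region, the endpoint URL could be derived from the region the function runs in. This is a minimal sketch, assuming the standard AWS partition (`amazonaws.com`) and the `AWS_REGION` environment variable that the Lambda runtime sets; the helper names and the fallback region are placeholders.

```python
import os


def sqs_endpoint_url(region):
    """Regional SQS endpoint URL (standard AWS partition assumed)."""
    return f"https://sqs.{region}.amazonaws.com"


def make_sqs_client(region=None):
    """Create an SQS client pinned to the regional endpoint URL."""
    import boto3  # imported lazily so sqs_endpoint_url works without boto3

    # AWS_REGION is provided by the Lambda runtime; the fallback is a placeholder.
    region = region or os.environ.get("AWS_REGION", "us-east-1")
    return boto3.client(
        "sqs",
        region_name=region,
        endpoint_url=sqs_endpoint_url(region),
    )
```

With this, `make_sqs_client().send_message(...)` is used exactly as with the client created above.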
