
Can't download file from S3 in AWS Lambda without any HTTP error

This is my code in AWS Lambda:

import boto3


def worker_handler(event, context):
    # Lambda can only write under /tmp, so the file is downloaded there.
    s3 = boto3.resource('s3')
    s3.meta.client.download_file('s3-bucket-with-script', 'scripts/HelloWorld.sh', '/tmp/hw.sh')
    print("Connecting to ")

I just want to download a file stored in S3, but when I run the code, the program just runs until it times out and prints nothing. This is the log output:

START RequestId: 8b9b86dd-4d40-11e6-b6c4-afcc5006f010 Version: $LATEST
END RequestId: 8b9b86dd-4d40-11e6-b6c4-afcc5006f010
REPORT RequestId: 8b9b86dd-4d40-11e6-b6c4-afcc5006f010  Duration: 300000.12 ms  Billed Duration: 300000 ms  Memory Size: 128 MB Max Memory Used: 28 MB  
2016-07-18T23:42:10.273Z 8b9b86dd-4d40-11e6-b6c4-afcc5006f010 Task timed out after 300.00 seconds

This is the role attached to the Lambda function. It shows that I have permission to get files from S3:
{
"Version": "2012-10-17",
"Statement": [
    {
        "Effect": "Allow",
        "Action": [
            "ec2:CreateNetworkInterface",
            "ec2:DescribeNetworkInterfaces",
            "ec2:DeleteNetworkInterface"
        ],
        "Resource": "*"
    },
    {
        "Effect": "Allow",
        "Action": [
            "logs:CreateLogGroup",
            "logs:CreateLogStream",
            "logs:PutLogEvents"
        ],
        "Resource": "arn:aws:logs:*:*:*"
    },
    {
        "Sid": "AllowPublicRead",
        "Effect": "Allow",
        "Action": [
            "s3:GetObject"
        ],
        "Resource": [
            "arn:aws:s3:::*"
        ]
    }
]
}

Is there any other setup I missed? Or is there any other way I can get this program working? Thanks in advance.

Per the docs, your code should be a little different:

    import boto3

    # Get the service client
    s3 = boto3.client('s3')

    # Download object at bucket-name with key-name to tmp.txt
    s3.download_file("bucket-name", "key-name", "tmp.txt")

Also, note that Lambda has an ephemeral file system, so downloading the file does nothing lasting on its own. Once the invocation ends and the Lambda environment shuts down, the file ceases to exist. If you want to keep it, you need to send it somewhere after you download it to Lambda.
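
As a rough sketch of that idea: download into /tmp (the only writable path inside Lambda), do the work, then copy the result somewhere durable. The destination bucket, the destination key, and the helper name `fetch_and_store` are hypothetical, and the S3 client is passed in as a parameter so the helper can be exercised without AWS.

```python
import os


def fetch_and_store(s3_client, src_bucket, src_key, dst_bucket, dst_key,
                    tmp_dir="/tmp"):
    """Download an object to local scratch space, then upload it elsewhere.

    s3_client is expected to behave like boto3.client('s3').
    """
    local_path = os.path.join(tmp_dir, os.path.basename(src_key))
    s3_client.download_file(src_bucket, src_key, local_path)
    # ... process the file here before it disappears with the environment ...
    s3_client.upload_file(local_path, dst_bucket, dst_key)
    return local_path


def worker_handler(event, context):
    import boto3  # imported here so the helper above has no AWS dependency

    s3 = boto3.client("s3")
    path = fetch_and_store(
        s3,
        "s3-bucket-with-script", "scripts/HelloWorld.sh",
        "my-output-bucket", "processed/HelloWorld.sh",  # hypothetical names
    )
    print("Stored a copy of " + path)
```

Note that the role shown in the question only allows `s3:GetObject`; the upload step would also need `s3:PutObject` on the destination bucket.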

Also, you may need to raise your timeout setting.

As you mentioned a timeout, I would check the network configuration. If your function runs inside a VPC, this may be caused by the lack of a route to the internet. This can be solved using a NAT Gateway or an S3 VPC endpoint. The video below explains the configuration required.
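
One way to confirm a routing problem quickly, instead of waiting for the full Lambda timeout, is to attempt a plain TCP connection with a short timeout. This is only a diagnostic sketch using the standard library: the helper name `can_reach` is made up, and `s3.amazonaws.com` is assumed as the endpoint (regional endpoints look like `s3.<region>.amazonaws.com`).

```python
import socket


def can_reach(host, port=443, timeout=3.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and DNS failures.
        return False


def worker_handler(event, context):
    # Assumed endpoint name; adjust to the region the bucket lives in.
    reachable = can_reach("s3.amazonaws.com")
    print("S3 reachable: {}".format(reachable))
    return reachable
```

If this prints `False` from inside the VPC, the download hang is most likely a networking issue rather than a permissions issue.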

Introducing VPC Support for AWS Lambda

As indicated in another answer, you may need a NAT Gateway or an S3 VPC endpoint. I needed one because my Lambda was in a VPC so it could access RDS. I started going through the trouble of setting up a NAT Gateway until I realized that a NAT Gateway is currently $0.045 per hour, or about $1 ($1.08) per day, which is way more than I wanted to spend.
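
The arithmetic behind that figure, as a small sketch. The hourly price is the one quoted above and may have changed since; the 30-day projection is an added illustration, not a number from the answer.

```python
from decimal import Decimal

NAT_HOURLY_USD = Decimal("0.045")  # hourly price quoted in the answer


def nat_gateway_cost(hours):
    """Hourly charge only; per-GB data processing is billed separately."""
    return NAT_HOURLY_USD * hours


per_day = nat_gateway_cost(24)           # 24 hours
per_30_days = nat_gateway_cost(24 * 30)  # illustrative 30-day month
print(per_day, per_30_days)
```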

Then I needed to consider an S3 VPC endpoint. This sounded like setting up another VPC, but it is not a VPC; it is a VPC endpoint. If you go into the VPC section, there is an "endpoint" section (on the left) along with subnets, routes, NAT gateways, etc. For all the complexity (in my opinion) of setting up the NAT gateway, the endpoint was surprisingly simple.

The only tricky part was selecting the service. You'll notice the service names are tied to the region you are in. For example, mine is "com.amazonaws.us-east-2.s3".

But then you may notice you have two options, a "gateway" and an "interface". On Reddit someone claimed that AWS charges for interfaces but not gateways, so I went with gateway and things seem to work.

https://www.reddit.com/r/aws/comments/a6yppu/eli5_what_is_the_difference_between_interface/

If you don't trust that Reddit user, I later found that AWS currently says this: "Note: To avoid the NAT Gateway Data Processing charge in this example, you could setup a Gateway Type VPC endpoint and route the traffic to/from S3 through the VPC endpoint instead of going through the NAT Gateway. There is no data processing or hourly charges for using Gateway Type VPC endpoints. For details on how to use VPC endpoints, please visit VPC Endpoints Documentation."

https://aws.amazon.com/vpc/pricing/
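
For anyone who prefers scripting this over clicking through the console, here is a minimal sketch of creating a gateway-type S3 endpoint with boto3's EC2 client. The helper names are made up, the VPC and route table IDs in the usage comment are placeholders, and the client is passed in so the function can be tried against a stub.

```python
def s3_service_name(region):
    """Build the region-specific S3 service name used by VPC endpoints."""
    return "com.amazonaws.{}.s3".format(region)


def create_s3_gateway_endpoint(ec2_client, vpc_id, route_table_ids, region):
    """Create a gateway-type endpoint and return its ID.

    ec2_client is expected to behave like boto3.client('ec2').
    """
    response = ec2_client.create_vpc_endpoint(
        VpcEndpointType="Gateway",
        VpcId=vpc_id,
        ServiceName=s3_service_name(region),
        RouteTableIds=list(route_table_ids),
    )
    return response["VpcEndpoint"]["VpcEndpointId"]


# Example usage (IDs are placeholders, not real resources):
# import boto3
# ec2 = boto3.client("ec2", region_name="us-east-2")
# create_s3_gateway_endpoint(
#     ec2, "vpc-0123456789abcdef0", ["rtb-0123456789abcdef0"], "us-east-2")
```

The route tables passed in should be the ones associated with the subnets the Lambda function runs in; otherwise traffic to S3 will not be routed through the endpoint.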

Note, I also updated the pathing type per an answer to this other question, but I'm not sure that really mattered.

https://stackoverflow.com/a/44478894/764365

Did you check whether your timeout was set correctly? I had the same issue, and it was timing out because my default value was set to 3 seconds and the file took longer than that to download.

Here is where you set the timeout:

[screenshot of the timeout setting]
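
The same setting can also be changed from code. A minimal sketch using the Lambda client's `update_function_configuration` call follows; the helper name and the function name in the usage comment are hypothetical, and the 900-second upper bound is assumed to be the current Lambda limit (the log in the question shows the older 300-second limit).

```python
MAX_TIMEOUT_SECONDS = 900  # assumed current Lambda maximum


def set_lambda_timeout(lambda_client, function_name, seconds):
    """Set the function timeout and return the value the service reports.

    lambda_client is expected to behave like boto3.client('lambda').
    """
    if not 1 <= seconds <= MAX_TIMEOUT_SECONDS:
        raise ValueError(
            "timeout must be between 1 and {} seconds".format(MAX_TIMEOUT_SECONDS))
    response = lambda_client.update_function_configuration(
        FunctionName=function_name,
        Timeout=seconds,
    )
    return response["Timeout"]


# Example usage (function name is a placeholder):
# import boto3
# set_lambda_timeout(boto3.client("lambda"), "my-download-function", 60)
```

Raising the timeout only helps when the download is merely slow. If the function cannot reach S3 at all, it will still hang until the new, longer timeout expires.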
