AWS Lambda and S3: passing s3 object path to image process function

My intention is to store a large image in S3 and then have a Lambda function read/process the file and save the resulting output(s). I'm using a package called python-bioformats to work with a proprietary image file (which is basically a whole bunch of TIFFs stacked together). When I use

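import urllib.parse

import bioformats
import boto3


def lambda_handler(event, context):
    # Object keys in S3 event notifications are URL-encoded, so decode first.
    key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'])
    bucket = 'bucketname'

    s3 = boto3.resource('s3')
    # This reads the whole object body into memory as bytes.
    imageobj = s3.Object(bucket, key).get()['Body'].read()

    # Note: imageobj holds the file contents here, not a file path.
    bioformats.get_omexml_metadata(imageobj)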

I have a feeling that the Lambda function tries to download the entire file (5 GB) when creating imageobj. Is there a way to have the second function (which takes a file path as its argument) refer to the S3 object in a filepath-like manner? I'd also like to avoid exposing the S3 bucket/object publicly, so doing this server-side would be ideal.

If your bioformats.get_omexml_metadata() function requires a file path as an argument, then you will need to download the object before calling the function.

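This could be a problem in an AWS Lambda function because, by default, only 512 MB of disk space is available (and only in /tmp/). The ephemeral storage size can be configured up to 10,240 MB, which would be enough for a 5 GB file, but the function would still have to download the whole object first.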
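As a rough illustration of the download-first approach, the sketch below checks the object size before copying it into /tmp/ and then passes the local path on. The helper names (key_from_event, download_to_tmp), the MAX_BYTES value and the 'bucketname' placeholder are assumptions made for this example, and python-bioformats is assumed to be packaged with the function together with a running Java VM.

```python
import os
import urllib.parse

TMP_DIR = "/tmp"  # the only writable directory in a Lambda function
MAX_BYTES = 512 * 1024 * 1024  # assumed /tmp size; adjust to your configuration


def key_from_event(event):
    """Return the URL-decoded object key from an S3 event notification."""
    raw_key = event["Records"][0]["s3"]["object"]["key"]
    return urllib.parse.unquote_plus(raw_key)


def download_to_tmp(s3_client, bucket, key, tmp_dir=TMP_DIR, max_bytes=MAX_BYTES):
    """Download s3://bucket/key into tmp_dir and return the local path."""
    # Ask S3 for the object size first so an oversized file fails early.
    size = s3_client.head_object(Bucket=bucket, Key=key)["ContentLength"]
    if size > max_bytes:
        raise ValueError(f"{key} is {size} bytes, more than the {max_bytes} allowed")
    local_path = os.path.join(tmp_dir, os.path.basename(key))
    s3_client.download_file(bucket, key, local_path)
    return local_path


def lambda_handler(event, context):
    # Imported here so the helpers above can be exercised without AWS access.
    import boto3
    import bioformats  # assumes a Java VM was already started via javabridge

    s3_client = boto3.client("s3")
    local_path = download_to_tmp(s3_client, "bucketname", key_from_event(event))
    return bioformats.get_omexml_metadata(local_path)
```

The object stays private in this setup: the function reads it through its execution role, so nothing has to be exposed publicly.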

If the data can instead be processed as a stream, you could read it as required without saving to disk first. However, the python-bioformats documentation does not show this as an option. In fact, I would be surprised if your code above works, given that the function expects a path while imageobj is the contents of the file.
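For code that can consume bytes directly, stream-style access to S3 might look like the sketch below. It only illustrates the general idea and does not apply to python-bioformats, which wants a path. The function names read_range and stream_sha256 are made up for this example, and the SHA-256 digest simply stands in for whatever per-chunk processing you actually need.

```python
import hashlib


def read_range(s3_client, bucket, key, start, length):
    """Fetch only bytes [start, start + length) of the object."""
    # HTTP byte ranges are inclusive at both ends.
    byte_range = f"bytes={start}-{start + length - 1}"
    response = s3_client.get_object(Bucket=bucket, Key=key, Range=byte_range)
    return response["Body"].read()


def stream_sha256(s3_client, bucket, key, chunk_size=1024 * 1024):
    """Hash the object chunk by chunk, holding one chunk in memory at a time."""
    body = s3_client.get_object(Bucket=bucket, Key=key)["Body"]
    digest = hashlib.sha256()
    for chunk in body.iter_chunks(chunk_size=chunk_size):
        digest.update(chunk)
    return digest.hexdigest()
```

Here s3_client would be a boto3.client('s3') instance. Neither function writes to disk, so the /tmp/ limit does not come into play.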
