
AWS S3 folder put event notification

I've written a function in Python that uploads a folder and its contents to S3. Now I would like S3 to generate an event (so I can send it to a Lambda function). S3 only generates events at the object (file) level; folders on S3 are just a visualization layer, which means that S3 has no internal representation for folders, and keys that share a prefix are simply grouped together. That said, so far I've come up with three approaches, all of which revolve around the idea of a 'poison pill'.

  1. Send a special file at the end of the folder upload process. Its creation sends an event to Lambda, which can open the file and read custom directives to act on. This approach seems quite flexible, but it poses serious concerns security-wise (I know that ACLs are in place for this reason, but I'm not quite sure they're enough), and it generates some overhead downloading/uploading/deleting the file from/to local memory (first sketch after this list).

  2. Map an event to the target Lambdas and fire it directly. The difference from the first approach is simply that in this case I'm not really creating a file on S3, I'm just making S3 believe so. I would use CloudWatch to fire custom S3-object-created events carrying the name of the folder for Lambda to pick up. This approach feels a little more hacky than the other two; moreover, when I researched the matter it seemed it shouldn't be possible to generate "mock" events on AWS (i.e. Trigger S3 create event). To my understanding, however, the put_events function should do the trick (second sketch below).

  3. Use SQS to put the folder name into a queue message that Lambda can later consume. This has some advantages over the other two approaches, since SQS now has a FIFO variant that allows for exactly-once processing, reprocessing of failures (via a dead-letter queue), etc.; however, it adds a non-trivial amount of complexity compared to the other approaches (third sketch below).
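For concreteness, here is roughly what approach 1 could look like with boto3. The marker key and the directive schema are made up for illustration:

```python
import json
import boto3

s3 = boto3.client("s3")

def upload_poison_pill(bucket, folder_prefix):
    # Hypothetical directive payload for the Lambda to act on.
    directives = {"folder": folder_prefix, "action": "process"}
    # The '_DONE.json' marker key is an arbitrary convention for this sketch;
    # an S3 notification filtered on that suffix would invoke the Lambda.
    s3.put_object(
        Bucket=bucket,
        Key=f"{folder_prefix}_DONE.json",
        Body=json.dumps(directives).encode("utf-8"),
    )
```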
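Approach 2 could be sketched as below. Note that put_events publishes a custom event, not a genuine s3:ObjectCreated event: the aws.* sources are reserved, so the Lambda has to be wired to a rule matching your own source string (all names here are placeholders):

```python
import json
import boto3

events = boto3.client("events")  # CloudWatch Events / EventBridge

def fire_folder_event(bucket, folder_prefix):
    events.put_events(
        Entries=[{
            "Source": "myapp.uploads",             # hypothetical custom source
            "DetailType": "FolderUploadComplete",  # hypothetical
            "Detail": json.dumps({"bucket": bucket, "folder": folder_prefix}),
        }]
    )
```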
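And approach 3, again as a minimal sketch with a placeholder queue URL; the queue must be a .fifo queue to get exactly-once processing:

```python
import json
import boto3

sqs = boto3.client("sqs")

def enqueue_folder(queue_url, bucket, folder_prefix):
    sqs.send_message(
        QueueUrl=queue_url,  # e.g. a *.fifo queue URL
        MessageBody=json.dumps({"bucket": bucket, "folder": folder_prefix}),
        MessageGroupId=folder_prefix,          # ordering scope within the queue
        MessageDeduplicationId=folder_prefix,  # or enable content-based dedup
    )
```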

At this point I'm trying to opt for the most 'correct' approach, and in order to do so I'm trying to weigh the pros and cons to make an informed decision, which leads me to some questions:

  • Is there another way I'm missing that does not involve client notification? (All the aforementioned approaches rely on the client sending the notification in one way or another, which is not very "cloudy".)

  • Is there a substantial difference between approaches 2 and 3, considering that both rely on sending the information in and out of a stream (CloudWatch and SQS, respectively)?

I think your question boils down to "how can I trigger a Lambda function after I have uploaded a folder full of files to S3?"

Unless you have some information a priori server-side that you can use to determine when the folder upload has completed, the client is going to have to tell you.

Options I would consider:

  1. Change your client to publish a message to SNS or SQS upon completion of the upload to S3. That message can then trigger your Lambda function (first sketch below).
  2. After the last file has been uploaded to folder images/dogs/, upload a zero-sized object whose key is the same as the folder (images/dogs/). This is a 'sentinel file'. Use an S3 event trigger with a suffix of / to detect the upload of that 'folder' object and trigger your Lambda (second sketch below).
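A minimal sketch of option 1, assuming a pre-existing SNS topic that the Lambda is subscribed to (the ARN is a placeholder):

```python
import json
import boto3

sns = boto3.client("sns")

def notify_upload_complete(topic_arn, bucket, folder_prefix):
    # Publish the 'finished upload' message; every subscriber
    # (Lambda, SQS, email, ...) receives a copy -- the fan-out.
    sns.publish(
        TopicArn=topic_arn,
        Message=json.dumps({"bucket": bucket, "folder": folder_prefix}),
    )
```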
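And option 2: the sentinel is just a zero-byte object whose key ends in /, so an event notification with a suffix filter of / only fires for these markers:

```python
import boto3

s3 = boto3.client("s3")

def mark_folder_done(bucket, folder_prefix):
    # folder_prefix must end with '/', e.g. 'images/dogs/'.
    # The empty body makes this the zero-sized 'folder' object itself.
    s3.put_object(Bucket=bucket, Key=folder_prefix, Body=b"")
```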

I prefer the 1st option. It achieves the end goal without resulting in extraneous S3 objects. With SNS you can also configure multiple downstream processes in response to the 'finished upload' message (a fan-out) if needed.

Have you considered using the prefix option of the S3 bucket event? I tested it and it worked fine. In my S3 bucket I created two folders, test1 and test2. On the S3 event I added the prefix test1; with that in place, every time a put/copy operation happens under that prefix, the Lambda is triggered.
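For reference, the same prefix filter can be configured from code instead of the console; a sketch with placeholder bucket and function names (the suffix filter used by option 2 above is configured the same way, with "Name": "suffix"):

```python
import boto3

s3 = boto3.client("s3")

# Invoke the Lambda only for objects created under the 'test1/' prefix.
# Bucket name and function ARN are placeholders.
s3.put_bucket_notification_configuration(
    Bucket="my-bucket",
    NotificationConfiguration={
        "LambdaFunctionConfigurations": [{
            "LambdaFunctionArn": "arn:aws:lambda:us-east-1:123456789012:function:my-func",
            "Events": ["s3:ObjectCreated:*"],
            "Filter": {"Key": {"FilterRules": [
                {"Name": "prefix", "Value": "test1/"}
            ]}},
        }]
    },
)
```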

