
AWS Glue ETL Job triggered on batches of S3 Events

I have an S3 bucket that gets many files dropped in it (1000 records/min). I want to trigger a Glue ETL job on batches of these dropped files.

I have looked at using Firehose to aggregate the events into batches, but that requires a long chain of resources, such as S3 -> Lambda -> Firehose -> ...

What is the best way to process my data in batches?

You can use AWS Glue job triggers, which let you run the Glue job at scheduled intervals rather than in response to each S3 event.
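As a rough sketch of the scheduled-trigger approach, the snippet below uses boto3's `create_trigger` call with `Type="SCHEDULED"`. The job name `my-etl-job`, the trigger naming scheme, and the 15-minute interval are all assumptions for illustration, not details from the question. Glue schedules use a six-field cron expression, and intervals shorter than 5 minutes are not supported, so the helper rejects them.

```python
def build_scheduled_trigger(job_name, every_minutes=15):
    """Build keyword arguments for glue.create_trigger (scheduled type).

    job_name and the trigger name format are hypothetical placeholders.
    """
    # Glue does not support schedules that fire more often than every 5 minutes.
    if not 5 <= every_minutes <= 59:
        raise ValueError("every_minutes must be between 5 and 59")
    return {
        "Name": f"{job_name}-every-{every_minutes}-min",
        "Type": "SCHEDULED",
        # Fields: minutes, hours, day-of-month, month, day-of-week, year.
        "Schedule": f"cron(0/{every_minutes} * * * ? *)",
        "Actions": [{"JobName": job_name}],
        "StartOnCreation": True,
    }


def create_scheduled_trigger(job_name, every_minutes=15):
    """Create the trigger in AWS; needs boto3 and valid AWS credentials."""
    import boto3  # imported here so the helper above works without boto3

    glue = boto3.client("glue")
    return glue.create_trigger(**build_scheduled_trigger(job_name, every_minutes))


if __name__ == "__main__":
    # Print the request that would be sent, without calling AWS.
    print(build_scheduled_trigger("my-etl-job", every_minutes=15))
```

Each scheduled run would then pick up whatever files have landed in the bucket since the previous run, so the batch size is set by the interval rather than by a count of S3 events.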

Are you processing streaming data? Based on the limited information you have given, I don't see a use case for Firehose.
