Run an ETL Python script in AWS triggered by S3
I am new to AWS and don't know how to do the following. When I put an object in S3, I want to launch a Python script that applies some transformations and writes the result to another path in S3. I've tried a Lambda function, but the process takes more than 300 seconds. I've also tried a Glue job, but I don't know how to trigger it when I put the file in S3.
Does anyone know how to do it? Maybe I'm using the wrong AWS tools.
One option would be to use SQS:
Can you break up the Python processing into smaller steps? I'd definitely recommend using Lambda instead of managing EC2 if you can get your code to run within the Lambda restrictions.
Here is a simple solution to your problem. You've already mentioned that you have an AWS Glue job that performs this operation, and the only thing you don't know is how to trigger the Glue job when a file is placed in S3, so that is the question I'm answering. You can write an AWS Lambda function with the boto3 module that is triggered by the S3 event and calls the Glue client's start_job_run method.
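import boto3
client = boto3.client('glue')  # Glue API client
# Replace 'string' with the name of your Glue job
response = client.start_job_run(JobName='string')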
https://boto3.readthedocs.io/en/latest/reference/services/glue.html#Glue.Client.start_job_run
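To make the idea above more concrete, here is a minimal sketch of what such a Lambda handler might look like. It assumes the function is subscribed to the bucket's object-created notifications, and it passes the bucket and key to the job as arguments. The job name `my-etl-job`, the argument names `--source_bucket` and `--source_key`, and the optional `glue_client` parameter (there only so the handler can be exercised with a stub) are all hypothetical and not part of the original answer.

```python
import urllib.parse

# Hypothetical job name; replace with the name of your own Glue job.
GLUE_JOB_NAME = "my-etl-job"


def extract_s3_objects(event):
    """Return (bucket, key) pairs from an S3 event notification payload."""
    objects = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 events (spaces become '+').
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        objects.append((bucket, key))
    return objects


def lambda_handler(event, context, glue_client=None):
    """Start one Glue job run per uploaded object and return the run IDs."""
    if glue_client is None:
        # Imported lazily so the module also loads where boto3 is absent
        # (for example in a local unit test that passes a stub client).
        import boto3
        glue_client = boto3.client("glue")

    run_ids = []
    for bucket, key in extract_s3_objects(event):
        response = glue_client.start_job_run(
            JobName=GLUE_JOB_NAME,
            # Assumed argument names; the Glue script has to read them itself.
            Arguments={"--source_bucket": bucket, "--source_key": key},
        )
        run_ids.append(response["JobRunId"])
    return {"job_run_ids": run_ids}
```

The Lambda only starts the job and returns, so it finishes in well under a second no matter how long the transformation takes. Its execution role would also need permission to call glue:StartJobRun.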
Note: I strongly believe that Glue, rather than Lambda, is the right tool for the requirement you describe, because AWS Lambda has a timeout limit. The cap was 300 seconds when this was written; it is now 900 seconds, which a long ETL run can still exceed.