Loading AWS Glue S3 Source Data
I have a use case where AWS Glue is a good fit for data transformation.
However, the source file for this transformation job is retrieved via an HTTPS call, which can take 45 minutes to return.
What is the best approach to load this data into S3 and then SFTP the Glue output once the job has completed?
This job needs to run both on a schedule and on demand.
I don't think Glue currently has a way to load data directly over HTTP/S.
You can create a Lambda function, or an EC2 instance running a service, that fetches the source file and puts it into an S3 bucket.
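As a rough illustration of that fetch-and-upload step, here is a minimal Python sketch. It assumes boto3 clients are passed in, and the function names, the bucket/key parameters, and the `--source_uri` job argument are all hypothetical, not something from the question. Keep in mind that a Lambda function can run for at most 15 minutes, so with a call that takes around 45 minutes this would more likely live on the EC2 instance (or another long-running host).

```python
import urllib.request


def fetch_to_s3(url, bucket, key, s3_client, timeout_s=3600,
                opener=urllib.request.urlopen):
    """Stream an HTTPS response body into S3 and return the object's URI."""
    # timeout_s is a socket timeout, so it must exceed the longest wait
    # for the server to start responding (about 45 minutes here).
    with opener(url, timeout=timeout_s) as response:
        # upload_fileobj reads the file-like object in chunks, so the
        # whole file never has to fit in memory.
        s3_client.upload_fileobj(response, bucket, key)
    return f"s3://{bucket}/{key}"


def start_glue_job(glue_client, job_name, source_uri):
    """Start the Glue job, passing the S3 location as a job argument."""
    run = glue_client.start_job_run(
        JobName=job_name,
        Arguments={"--source_uri": source_uri},  # hypothetical argument name
    )
    return run["JobRunId"]
```

With boto3 installed, the clients would be created with `boto3.client("s3")` and `boto3.client("glue")`, and the two functions called one after the other. Because the clients are passed in, the same code can be triggered by a scheduler or run by hand.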
I would suggest doing this: