
Synchronous call from AWS Lambda to Athena

I want to query data in my S3 buckets with Athena from an AWS Lambda function. In the examples I have looked at, the call from Lambda to Athena appears to be asynchronous: the Lambda calls Athena and then waits for Athena to write the results to an S3 bucket. Is there a way to retrieve the response directly instead of having it written to an S3 bucket?

There is not. Athena always writes the results to S3 (even with the new semi-private "streaming" API that the JDBC driver uses). The only way to know when an Athena query has completed is to poll with the GetQueryExecution API call. Even seemingly synchronous APIs like the JDBC driver use this method internally.
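The polling described above can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: `run_query_and_wait`, its parameters, and the timeout handling are names and choices made up for this example, and `athena` is assumed to be a boto3 Athena client created elsewhere.

```python
import time


def run_query_and_wait(athena, sql, output_location,
                       poll_seconds=1.0, timeout_seconds=60.0):
    """Start an Athena query and block until it reaches a terminal state.

    `athena` is assumed to be a boto3 client, e.g. boto3.client("athena").
    `output_location` is an S3 URI such as "s3://my-bucket/prefix/"
    (hypothetical); Athena still writes the results there.
    """
    started = athena.start_query_execution(
        QueryString=sql,
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]
    deadline = time.monotonic() + timeout_seconds

    while True:
        execution = athena.get_query_execution(QueryExecutionId=query_id)
        status = execution["QueryExecution"]["Status"]
        state = status["State"]

        if state == "SUCCEEDED":
            return query_id
        if state in ("FAILED", "CANCELLED"):
            reason = status.get("StateChangeReason", "no reason given")
            raise RuntimeError(f"Query {query_id} {state}: {reason}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Query {query_id} still {state} after timeout")

        # QUEUED or RUNNING: wait a little before asking again.
        time.sleep(poll_seconds)
```

In a Lambda handler this would be called with something like `run_query_and_wait(boto3.client("athena"), "SELECT 1", "s3://my-bucket/prefix/")`; the function keeps running, and keeps being billed, for as long as the loop waits.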

However, you do not need to read the response from S3: the GetQueryResults API call returns the result along with type information. If the response has fewer than 1,000 rows, or performance is not the top priority, it is a better way to retrieve the results than reading the CSV file from S3.
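As a rough sketch of reading results through GetQueryResults with boto3: `fetch_results` is a hypothetical helper name, `athena` is assumed to be a boto3 Athena client, and `query_id` the ID of a query that has already succeeded. The code assumes a SELECT query, where the first row of the first page repeats the column names.

```python
def fetch_results(athena, query_id):
    """Return (columns, rows) for a finished Athena SELECT query.

    columns: list of (name, type) tuples taken from the result metadata.
    rows:    list of dicts mapping column name to the cell value as a
             string, or None when the cell is null.
    """
    paginator = athena.get_paginator("get_query_results")
    columns = None
    rows = []

    for page in paginator.paginate(QueryExecutionId=query_id):
        result_set = page["ResultSet"]
        page_rows = result_set["Rows"]

        if columns is None:
            info = result_set["ResultSetMetadata"]["ColumnInfo"]
            columns = [(col["Name"], col["Type"]) for col in info]
            # Assumption: for SELECT queries the first row of the first
            # page is a header row holding the column names; skip it.
            page_rows = page_rows[1:]

        names = [name for name, _ in columns]
        for row in page_rows:
            # Null cells come back without a VarCharValue key.
            values = [cell.get("VarCharValue") for cell in row["Data"]]
            rows.append(dict(zip(names, values)))

    return columns, rows
```

Every value arrives as a string, so the type names in `columns` are what you would use to convert them. Each call returns at most 1,000 rows, which is why larger results need several round trips and reading the CSV file becomes more attractive.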

If you're using Athena from Lambda, my suggestion is to look at Step Functions. Unless your Athena queries never run for more than a few seconds, you can save a lot of money by building a simple state machine that executes the query, since a Lambda function that polls is billed for the whole time it spends waiting. You can find a good blueprint in the job poller sample project.
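A state machine along those lines might look roughly like the definition built below. This is a sketch of the general start / wait / check / branch pattern, not a copy of the sample project: the state names, the three Lambda function ARNs, and the `$.status.state` field that the status function is assumed to return are all hypothetical.

```python
import json


def build_poller_definition(start_fn_arn, status_fn_arn, results_fn_arn,
                            wait_seconds=5):
    """Build an Amazon States Language definition for an Athena poller.

    The three ARNs point at hypothetical Lambda functions that start the
    query, report its state as {"state": "..."}, and fetch the results.
    """
    return {
        "Comment": "Start an Athena query, then poll until it finishes",
        "StartAt": "StartQuery",
        "States": {
            "StartQuery": {
                "Type": "Task",
                "Resource": start_fn_arn,
                "ResultPath": "$.query",
                "Next": "Wait",
            },
            # No Lambda function is running during the Wait state.
            "Wait": {
                "Type": "Wait",
                "Seconds": wait_seconds,
                "Next": "CheckStatus",
            },
            "CheckStatus": {
                "Type": "Task",
                "Resource": status_fn_arn,
                "ResultPath": "$.status",
                "Next": "IsDone",
            },
            "IsDone": {
                "Type": "Choice",
                "Choices": [
                    {"Variable": "$.status.state",
                     "StringEquals": "SUCCEEDED", "Next": "GetResults"},
                    {"Variable": "$.status.state",
                     "StringEquals": "FAILED", "Next": "QueryFailed"},
                    {"Variable": "$.status.state",
                     "StringEquals": "CANCELLED", "Next": "QueryFailed"},
                ],
                # Still QUEUED or RUNNING: go around the loop again.
                "Default": "Wait",
            },
            "GetResults": {
                "Type": "Task",
                "Resource": results_fn_arn,
                "End": True,
            },
            "QueryFailed": {
                "Type": "Fail",
                "Error": "AthenaQueryFailed",
                "Cause": "The Athena query failed or was cancelled",
            },
        },
    }
```

Passing the result through `json.dumps(...)` gives the JSON text to use as the state machine definition. Each Lambda invocation then lasts only as long as a single API call, and the waiting happens in the state machine instead.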

AwsWrangler (the awswrangler package, also known as AWS SDK for pandas) provides a synchronous interface that retrieves Athena results and returns them in memory. It uses a few different strategies for this, depending on which options are selected.

https://aws-sdk-pandas.readthedocs.io/en/stable/tutorials/006%20-%20Amazon%20Athena.html

pip install awswrangler
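Once the package is installed, a query can be run and loaded into a pandas DataFrame in one call. The sketch below is only an illustration: `query_to_dataframe` and its `use_ctas` flag are names invented for this example, and the database and table names in the usage note are placeholders. It calls `wr.athena.read_sql_query`, whose `ctas_approach` option is one of the switches that selects the retrieval strategy.

```python
def query_to_dataframe(sql, database, use_ctas=False):
    """Run an Athena query and return the result as a pandas DataFrame.

    The call blocks until the query has finished. With use_ctas=False the
    library reads the regular query output; with use_ctas=True it wraps
    the query in a CTAS statement and reads the resulting Parquet files.
    """
    # Imported here so this module still loads where awswrangler is absent.
    import awswrangler as wr

    return wr.athena.read_sql_query(
        sql=sql,
        database=database,
        ctas_approach=use_ctas,
    )
```

For example, `query_to_dataframe("SELECT * FROM my_table LIMIT 10", "my_database")` would return the first ten rows as a DataFrame. The polling still happens, but inside the library rather than in your handler.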

