
Synchronous call from AWS Lambda to Athena

I am looking to query data in my S3 buckets using Athena from my AWS Lambda. In the examples I have looked at, the call from Lambda to Athena seems to be asynchronous: the Lambda starts the query and then waits for Athena to write the results to an S3 bucket. Is there a way to retrieve the response directly instead of having it written to an S3 bucket?

There is not. Athena will always write the results to S3 (even with the new semi-private "streaming" API that is used by the JDBC driver). The only way to know when an Athena query is completed is to poll using the GetQueryExecution API call. Even seemingly synchronous APIs like the JDBC driver use this method internally.
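The polling described above can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: it assumes `athena` is a boto3 Athena client (for example `boto3.client("athena")`), and the function name, the timeout handling and the injectable `sleep` parameter are illustrative choices of mine, not anything Athena prescribes.

```python
import time


def run_query_and_wait(athena, sql, database, output_location,
                       poll_seconds=1.0, timeout_seconds=60.0,
                       sleep=time.sleep):
    """Start an Athena query and block until it reaches a terminal state.

    `athena` is assumed to be a boto3 Athena client. Returns the query
    execution id on success; raises on failure, cancellation or timeout.
    """
    started = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        # Athena still writes the result file here; this cannot be skipped.
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]
    deadline = time.monotonic() + timeout_seconds

    while True:
        execution = athena.get_query_execution(QueryExecutionId=query_id)
        status = execution["QueryExecution"]["Status"]
        state = status["State"]  # QUEUED, RUNNING, SUCCEEDED, FAILED, CANCELLED

        if state == "SUCCEEDED":
            return query_id
        if state in ("FAILED", "CANCELLED"):
            reason = status.get("StateChangeReason", "no reason given")
            raise RuntimeError(f"Athena query {query_id} {state}: {reason}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Athena query {query_id} still {state}")

        sleep(poll_seconds)
```

In a Lambda handler you would create the client once outside the handler and call something like `run_query_and_wait(athena, "SELECT 1", "default", "s3://my-bucket/athena-results/")`, where the database and bucket names are placeholders. Keep in mind that the Lambda is running, and billed, for the whole time it waits.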

However, there is no need to read the response from S3 yourself: the GetQueryResults API call returns the result rows along with type information. If there are fewer than 1000 rows in the response, or performance is not the top priority, it is a better way to retrieve the results than reading the CSV file from S3.
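As a rough illustration of reading the rows through GetQueryResults, here is a sketch that assumes `athena` is a boto3 Athena client and that the query was a plain SELECT, in which case the first row of the first page holds the column names. The function name and the returned shape are my own choices.

```python
def fetch_rows(athena, query_id):
    """Collect all result rows of a finished SELECT query.

    Returns (columns, rows): `columns` is a list of (name, type) pairs taken
    from the result metadata, `rows` is a list of dicts keyed by column name.
    Every value arrives as a string (or None for NULL); converting by type
    is left to the caller.
    """
    columns = None
    rows = []
    kwargs = {"QueryExecutionId": query_id}
    first_page = True

    while True:
        page = athena.get_query_results(**kwargs)
        result_set = page["ResultSet"]

        if columns is None:
            column_info = result_set["ResultSetMetadata"]["ColumnInfo"]
            columns = [(col["Name"], col["Type"]) for col in column_info]

        page_rows = result_set["Rows"]
        if first_page:
            # Assumption: SELECT query, so the first row is the header.
            page_rows = page_rows[1:]
            first_page = False

        names = [name for name, _ in columns]
        for row in page_rows:
            # NULL cells come back without a VarCharValue key.
            values = [cell.get("VarCharValue") for cell in row["Data"]]
            rows.append(dict(zip(names, values)))

        token = page.get("NextToken")
        if not token:
            return columns, rows
        kwargs["NextToken"] = token
```

Each call returns at most 1000 rows, so larger results need several round trips; that is where reading the file from S3 starts to win.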

If you're using Athena from Lambda, my suggestion is to look at Step Functions. A Lambda that sits polling is billed for the whole time it waits, so unless your Athena queries never run for more than a few seconds you can save a lot of money by building a simple state machine that executes the query. You can find a good blueprint in the job poller sample project.
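To give an idea of what such a state machine might look like, here is a sketch that builds an Amazon States Language definition as a Python dict. It is not the job poller sample itself: it assumes the Step Functions service integration for Athena (the `arn:aws:states:::athena:...` resources), and the state names and the input fields `query`, `database` and `output_location` are made up for illustration.

```python
import json


def build_athena_poller_definition(wait_seconds=5):
    """Return a start / wait / check / branch state machine definition."""
    state_path = "$.execution.QueryExecution.Status.State"
    return {
        "Comment": "Run an Athena query and poll until it finishes (sketch)",
        "StartAt": "StartQuery",
        "States": {
            "StartQuery": {
                "Type": "Task",
                "Resource": "arn:aws:states:::athena:startQueryExecution",
                "Parameters": {
                    # Hypothetical input fields of the execution.
                    "QueryString.$": "$.query",
                    "QueryExecutionContext": {"Database.$": "$.database"},
                    "ResultConfiguration": {
                        "OutputLocation.$": "$.output_location"
                    },
                },
                "ResultPath": "$.started",
                "Next": "WaitForQuery",
            },
            # Waiting costs nothing here, unlike sleeping inside a Lambda.
            "WaitForQuery": {
                "Type": "Wait",
                "Seconds": wait_seconds,
                "Next": "CheckStatus",
            },
            "CheckStatus": {
                "Type": "Task",
                "Resource": "arn:aws:states:::athena:getQueryExecution",
                "Parameters": {
                    "QueryExecutionId.$": "$.started.QueryExecutionId"
                },
                "ResultPath": "$.execution",
                "Next": "IsQueryDone",
            },
            "IsQueryDone": {
                "Type": "Choice",
                "Choices": [
                    {"Variable": state_path, "StringEquals": "SUCCEEDED",
                     "Next": "GetResults"},
                    {"Variable": state_path, "StringEquals": "FAILED",
                     "Next": "QueryFailed"},
                    {"Variable": state_path, "StringEquals": "CANCELLED",
                     "Next": "QueryFailed"},
                ],
                "Default": "WaitForQuery",  # still QUEUED or RUNNING
            },
            "GetResults": {
                "Type": "Task",
                "Resource": "arn:aws:states:::athena:getQueryResults",
                "Parameters": {
                    "QueryExecutionId.$": "$.started.QueryExecutionId"
                },
                "End": True,
            },
            "QueryFailed": {
                "Type": "Fail",
                "Error": "AthenaQueryFailed",
                "Cause": "Athena query failed or was cancelled",
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_athena_poller_definition(), indent=2))
```

The printed JSON is what you would paste into the state machine definition. As far as I know, Step Functions also offers a `.sync` variant of the start-query integration that does the waiting for you, which would make the explicit wait and check states unnecessary; the explicit loop is shown because it mirrors the job poller pattern.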

AwsWrangler (now published as AWS SDK for pandas) provides a synchronous interface that runs an Athena query and returns the results in memory. It uses a few different strategies to do this, depending on which options are selected.

https://aws-sdk-pandas.readthedocs.io/en/stable/tutorials/006%20-%20Amazon%20Athena.html

pip install awswrangler
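A minimal usage sketch, assuming the package is installed and AWS credentials are available. `wr.athena.read_sql_query` is the library call; the wrapper function, its name and the database name in the usage note are placeholders of mine.

```python
def query_to_records(sql, database, ctas_approach=False):
    """Run an Athena query through awswrangler and return a list of dicts.

    The call blocks until the query has finished; the polling happens
    inside the library.
    """
    # Imported inside the function so this module can still be loaded in an
    # environment where awswrangler is not installed.
    import awswrangler as wr

    # ctas_approach=False reads the regular query output. The library's
    # default (True) wraps the query in a CTAS, which is usually faster for
    # large results but needs permission to create and drop tables.
    df = wr.athena.read_sql_query(
        sql=sql,
        database=database,
        ctas_approach=ctas_approach,
    )
    return df.to_dict(orient="records")
```

For example, `query_to_records("SELECT 1 AS n", "default")` would return the rows as plain Python dicts, which is convenient for a JSON response from Lambda. In Lambda the package normally has to be supplied as a layer or bundled into the deployment package.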
