How can I schedule or queue api calls to maintain rate limit?

Original 2016-06-19 20:40:20 3 1 python/ amazon-web-services/ scheduled-tasks/ celery/ task-queue

I am trying to continuously crawl a large amount of information from a site using the REST API they provide. I have the following constraints:

Stay within the API limit (5 calls/sec)
Utilise the full limit (making exactly 5 calls per second, 5*60 calls per minute)
Each call will be with different parameters (params will be fetched from db or in-memory cache)
Calls will be made from AWS EC2 (or GAE) and processed data will be stored in AWS RDS/DynamoDB

For now I am just using a scheduled task that runs a Python script every minute; the script makes 10-20 API calls, processes the responses, and stores the data in the DB. I want to scale this procedure (to 5*60 = 300 calls per minute) and make it manageable via code (pushing new tasks, pausing/resuming them easily, monitoring failures, changing the call frequency).

My question is: what are the best available tools to achieve this? Any suggestion/guidance/link is appreciated.

I do know the names of some task queuing frameworks like Celery/RabbitMQ/Redis, but I do not know much about them. However, I am willing to learn one or all of them if they are the best tools to solve my problem; I want to hear from SO veterans before jumping in ☺
Also please let me know if there's any other AWS service I should look to use (SQS or AWS Data Pipeline?) to make any step easier.

1 answer

You needn't add an external dependency just for rate-limiting, as your use case is rather straightforward.

I can think of two options:

1. Modify the script (that currently wakes up every minute and makes 10-20 API calls) to wake up every second and make 5 calls (sequentially or in parallel).
- In your current design, your API calls might not be evenly distributed across the minute, i.e. you might be making all 10-20 calls in the first, say, 20 seconds.
- If you change the script to run every second, your API call rate will be more balanced.
2. Change your Python script to a long-running daemon, and use a rate limiter library, such as this. You can configure the latter to make 1 call per x seconds.
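
As a rough illustration of the long-running daemon approach, here is a minimal sketch that spaces calls 1/5 of a second apart without any external library. It is not the library linked above: `IntervalRateLimiter`, `run`, and `call_api` are hypothetical names made up for this example, and it assumes a single process making calls sequentially (with several workers or machines, the limiter state would need to be shared).

```python
import time
from typing import Any, Callable, Iterable, List


class IntervalRateLimiter:
    """Hypothetical helper: spaces calls at least 1/calls_per_second apart."""

    def __init__(
        self,
        calls_per_second: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        if calls_per_second <= 0:
            raise ValueError("calls_per_second must be positive")
        self.interval = 1.0 / calls_per_second
        self._clock = clock
        self._sleep = sleep
        # Earliest moment at which the next call is allowed to start.
        self._next_slot = clock()

    def wait(self) -> None:
        """Block until the next slot is free, then reserve the one after it."""
        delay = self._next_slot - self._clock()
        if delay > 0:
            self._sleep(delay)
        # Schedule from whichever is later, so idle time is never "banked"
        # and released later as a burst above the limit.
        self._next_slot = max(self._next_slot, self._clock()) + self.interval


def run(
    param_sets: Iterable[Any],
    call_api: Callable[[Any], Any],
    limiter: IntervalRateLimiter,
) -> List[Any]:
    """Call `call_api` once per parameter set, never faster than the limiter allows."""
    results = []
    for params in param_sets:
        limiter.wait()
        results.append(call_api(params))
    return results


if __name__ == "__main__":
    # Stand-in for the real REST call; a real daemon would pull parameters
    # from the DB or cache and store each processed response.
    def fake_call_api(params: int) -> str:
        return f"response for {params}"

    limiter = IntervalRateLimiter(calls_per_second=5)
    for line in run(range(5), fake_call_api, limiter):
        print(line)
```

Because the limiter only tracks when the next call may start, a slow API response reduces throughput below 5 calls per second rather than exceeding it. Filling the limit exactly would mean issuing the calls from several threads that share one limiter (guarded by a lock).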





solution1 1 ACCEPTED 2016-06-20 03:41:32