
1 million jobs every minute using python-eventlet

Use Case:

  1. Read data from one server
  2. Manipulate it on my server
  3. Post the data to another server

But I need a throughput of 1 million jobs per minute.

More explanation:

Let's assume there are 10000 customers. For each customer I need to call 5 APIs and manipulate the data in each response. The manipulation produces about 30 API calls, which I want to post to the other server.

(Assumption: the server I fetch data from takes 250 ms per API call, and the server I post data to takes 350 ms per POST call.)

Pseudo Code:

Every minute, for each customer (there are 10000 customers):


Fetch data from first_server_for_first_service
Fetch data from first_server_for_second_service
Fetch data from first_server_for_third_service
Fetch data from first_server_for_fourth_service
Fetch data from first_server_for_fifth_service

Manipulate data of first_service
Manipulate data of second_service
Manipulate data of third_service
Manipulate data of fourth_service
Manipulate data of fifth_service

post data to second_server_for_first_service_1_type
post data to second_server_for_first_service_2_type
post data to second_server_for_first_service_3_type
post data to second_server_for_first_service_4_type
post data to second_server_for_first_service_5_type
post data to second_server_for_first_service_6_type
post data to second_server_for_second_service_1_type
post data to second_server_for_second_service_2_type
post data to second_server_for_second_service_3_type
post data to second_server_for_second_service_4_type
post data to second_server_for_second_service_5_type
post data to second_server_for_second_service_6_type
post data to second_server_for_third_service_1_type
post data to second_server_for_third_service_2_type
post data to second_server_for_third_service_3_type
post data to second_server_for_third_service_4_type
post data to second_server_for_third_service_5_type
post data to second_server_for_third_service_6_type
post data to second_server_for_fourth_service_1_type
post data to second_server_for_fourth_service_2_type
post data to second_server_for_fourth_service_3_type
post data to second_server_for_fourth_service_4_type
post data to second_server_for_fourth_service_5_type
post data to second_server_for_fourth_service_6_type
post data to second_server_for_fifth_service_1_type
post data to second_server_for_fifth_service_2_type
post data to second_server_for_fifth_service_3_type
post data to second_server_for_fifth_service_4_type
post data to second_server_for_fifth_service_5_type
post data to second_server_for_fifth_service_6_type

How can we write the code with Eventlet so that it executes this many tasks in parallel? Or will Eventlet even be able to execute this many tasks?

Please reply.

Short answer: this is a hard requirement. If you absolutely cannot reduce the load, I strongly suggest looking at fast languages with built-in concurrency support: Go, Haskell, OCaml. PyPy is also supposed to help in this case.

10000 * 35 = 350K API calls per minute, or ~6K per second. Assuming a 350 ms response time, you would need ~2100 concurrent connections (6K calls/s * 0.35 s) to the upstream and downstream services combined to keep up. Eventlet can host this number of greenthreads without breaking a sweat.
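The estimate above can be reproduced with a few lines of arithmetic. This is only a sketch: the constants come from the question, the function names are made up for illustration, and the result uses the exact rate (~5.8K/s) instead of the rounded 6K/s, so it comes out slightly below 2100.

import math

CUSTOMERS = 10_000
CALLS_PER_CUSTOMER = 5 + 30   # 5 fetches + 30 posts per customer (from the question)
WINDOW_SECONDS = 60           # everything has to finish within one minute
LATENCY_SECONDS = 0.350       # slowest call in the question (the POST)


def required_rate(customers, calls_per_customer, window_seconds):
    """Average number of API calls per second needed to keep up."""
    return customers * calls_per_customer / window_seconds


def required_connections(rate_per_second, latency_seconds):
    """Calls in flight at any moment: arrival rate times time per call."""
    return math.ceil(rate_per_second * latency_seconds)


rate = required_rate(CUSTOMERS, CALLS_PER_CUSTOMER, WINDOW_SECONDS)
connections = required_connections(rate, LATENCY_SECONDS)
print(f"{rate:.0f} calls/s, about {connections} concurrent connections")

Treating every call as taking 350 ms is pessimistic for the 250 ms fetches, so the real number of open connections should be somewhat lower.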

But then you have big problems with CPU. The smallest Eventlet overhead I measured on an old Core 2 Duo box is ~25 µs. And you only have ~166 µs (1 second / 6K ops) for each call. Good luck doing useful data processing in the remaining ~140 µs in Python. The good news is that you should be able to process each batch of, say, 1000 clients in a separate process and spread the CPU load across 10 cores.
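A rough sketch of that CPU budget, and of one way to split the clients between worker processes. The helper names are hypothetical, and only the partitioning is shown; each slice would then be handed to its own process (for example via multiprocessing) running its own Eventlet loop.

def per_call_budget_us(rate_per_second, overhead_us, cores=1):
    """Microseconds of CPU left per call after the greenthread overhead."""
    return cores * 1_000_000 / rate_per_second - overhead_us


def shard(clients, workers):
    """Deal clients out round-robin into `workers` roughly equal lists."""
    return [clients[i::workers] for i in range(workers)]


clients = [f"client{n}" for n in range(10_000)]
shards = shard(clients, 10)

print(round(per_call_budget_us(6000, 25)))            # one core
print(round(per_call_budget_us(6000, 25, cores=10)))  # ten cores
print(len(shards), len(shards[0]))

With ten processes each one only has to sustain about a tenth of the call rate, which is where the extra per-call headroom comes from.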

It would not take any particularly interesting code to solve this task with Eventlet. The example code below is probably the simplest possible way. Your API calls must be able to reuse existing socket connections. You may want to add concurrency or throughput limits using queues or semaphores.

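import time

import eventlet

# API, process and service2..service5 are placeholders; they are not defined here.
clients = ['client1', 'client2', ...] # 10K


def service1(request):
    # Fetch, transform, then fan the posts out without waiting for them.
    data1 = API.get()
    data2 = process(data1)
    eventlet.spawn(API.post_type_1, data2)
    eventlet.spawn(API.post_type_2, data2)
    # ...


def tick():
    now = time.time()
    for client in clients:
        # some context object
        request = (client, now)

        # One greenthread per service, so the five fetches overlap.
        eventlet.spawn(service1, request)
        eventlet.spawn(service2, request)
        eventlet.spawn(service3, request)
        eventlet.spawn(service4, request)
        eventlet.spawn(service5, request)


def main():
    while True:
        tick()
        # Sleeping yields to the hub, which lets the spawned greenthreads run.
        eventlet.sleep(60)


if __name__ == '__main__':
    main()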
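
The answer mentions limiting concurrency with queues or semaphores. Another option that ships with Eventlet is GreenPool, which caps the number of greenthreads running at once. The following is a minimal sketch, not the answer's own code: fake_post is a hypothetical stand-in for the real POST call, and the limit of 2100 is simply the connection estimate from earlier.

import eventlet

MAX_IN_FLIGHT = 2100  # assumed cap, taken from the ~2100 connection estimate


def fake_post(payload):
    """Hypothetical stand-in for API.post_type_N: sleeps instead of doing I/O."""
    eventlet.sleep(0.01)
    return payload


def run_batch(payloads, limit=MAX_IN_FLIGHT):
    """Run fake_post over payloads with at most `limit` greenthreads at a time."""
    pool = eventlet.GreenPool(limit)
    # imap stops spawning while the pool is full and yields results in input order.
    return list(pool.imap(fake_post, payloads))


results = run_batch(range(100), limit=10)
print(len(results))

In the real program the same pool could be shared by all services so that the cap applies to the process as a whole.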

