简体   繁体   中英

Python Multithreading a for loop with limited Threads

i am just learning Python and dont have much expierence with Multithreading. I am trying to send some json via the Requests session.post Method. This is called in the function at the bottem of the many for loops i need to run through the dictionary.

Is there a way to let this run in paralell?

I also have to limit my numbers of Threads, otherwise the post calls get blocked because they are to fast after each other. Help would be much appreciated.

def doWork(session, List, RefHashList):
    for itemRefHash in RefHashList:
        for equipment in res['Response']['data']['items']:
            if equipment['itemHash'] == itemRefHash:
                if equipment['characterIndex'] != 0:
                    SendJsonViaSession(session, getCharacterIdFromIndex(res, equipment['characterIndex']), itemRefHash, equipment['quantity'])

First, structuring your code differently might improve the speed without the added complexity of threading.

def doWork(session, res, RefHashList):
    for equipment in res['Response']['data']['items']:
        i = equipment['itemHash']
        k = equipment['characterIndex']
        if i in RefHashList and k != 0:
            SendJsonViaSession(session, getCharacterIdFromIndex(res, k), i, equipment['quantity'])

To start with, we will look up equipment['itemHash'] and equipment['characterIndex'] only once.

Instead of explicitly looping over RefHashList , you could use the in operator. This moves the loop into the Python virtual machine, which is faster.

And instead of a nested if -conditional, you could use a single conditional using and .

Note: I have removed the unused parameter List , and replaced it with res . It is generally good practice to write functions that only act on parameters that they are given, not global variables.

Second, how much extra performance do you need? How much time is there on average between the SendJsonViaSession calls, and how small can this this time become before calls get blocked? If the difference between those numbers is small, it is probably not worth to implement a threaded sender.

Third, a design feature of the standard Python implementation is that only one thread at a time can be executing Python bytecode. So it is not certain that threading will improve performance.

Edit:

There are several ways to run stuff in parallel in Python. There is multiprocessing.Pool which uses processes, and multiprocessing.dummy.ThreadPool which uses threads. And from Python 3.2 onwards there is concurrent.futures , which can use processes or threads.

The thing is, neither of them has rate limiting. So you could get blocked for making too many calls. Every time you call SendJsonViaSession you'd have to save the current time somehow so that all processes or threads can use it. And before every call, you would have to read that time and wait if it is too close to the last call.

Edit2:

If a call to SendJsonViaSession only takes 0.3 seconds, you should be able to do 3 calls/second sequentially. But your code only does 1 call/second. This implies that the speed restriction is somewhere else. You'd have to profile your code to see where the problem lies.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM