
Python asyncio tasks not executing in parallel

I am creating async tasks so that they can execute in parallel, like this:

for symbol in config.symbol_list:
    tasks.append(asyncio.ensure_future(get_today_Data_async(symbol), loop=loop))
loop.run_until_complete(asyncio.wait(tasks))

This is the task that I want to execute in parallel:

async def get_today_Data_async(symbol):

    periodType = 'day'
    period = 1
    frequencyType = 'minute'
    frequency = '1'
    use_last10_Min = False
    logging.info(f'Updating data {symbol} started...')
    try:
        logging.info(f'thread id - {threading.get_ident()} getting market data {symbol} periodType {periodType} period {period} frequencyType {frequencyType} frequency {frequency}')

        est = pytz.timezone('US/Eastern')
        if use_last10_Min:
            startDate = (datetime.datetime.now()- datetime.timedelta(minutes=10)).astimezone(tz=est).timestamp()
        else:
            startDate =(datetime.datetime.now().replace(hour=0, minute=0, second=0, microsecond=0)).astimezone(tz=est).timestamp()
        endDate = (datetime.datetime.now()+datetime.timedelta(hours=48)).astimezone(tz=est).timestamp()
        endDate = str(endDate).split('.')[0] + '000'
        startDate = str(startDate).split('.')[0] + '000'

        reqDict = {'apikey': '' + config.client_id + '@AMER.OAUTHAP','endDate': endDate, 'frequencyType': frequencyType,'frequency': frequency,
                   'startDate': startDate, 'needExtendedHoursData': usePreMarket}

        header = {'Authorization': 'Bearer ' + config.token['access_token'] + '', 'content-type': 'application/json'}
        logging.info(f"thread id - {threading.get_ident()} datetime check {symbol} {datetime.datetime.now()}   {reqDict}")
        with await tlock:
            resp = requests.get("https://api.tdameritrade.com/v1/marketdata/" + symbol + "/pricehistory", params=reqDict)
        logging.info(f'thread id - {threading.get_ident()} datetime check {symbol} {datetime.datetime.now()} {resp.status_code}')
        if resp.status_code == 200 and not resp.json()['empty']:
            candles = resp.json()['candles']
            logging.info(f"symbol candel {symbol} {frequencyType} {frequency} {period} {get_one_hour(resp.json()['candles'])}")
            if not usePreMarket:
                newcandles = []
                EST = pytz.timezone('us/eastern')
                time_ist_end = datetime.datetime.now(EST).replace(hour=16, minute=00, second=00)
                time_ist_start = time_ist_end.replace(hour=9, minute=30, second=00)
                for x in candles:
                    tmp_date = datetime.datetime.fromtimestamp((x.get('datetime') / 1000), tz=pytz.timezone('US/Eastern'))
                    if tmp_date > time_ist_start and tmp_date < time_ist_end:
                        newcandles.append(x)
                if len(newcandles) > 0:
                    process_price(symbol,newcandles)
            else:
                if len(candles) > 0:
                    process_price(symbol, candles)

        logging.info(f" symbol - {symbol} status code {resp.status_code} resp {resp.text}")

    except Exception as e:
        traceback.print_exc()
        logging.error(f'Error in getting price {e}')
    logging.info(f'Updating data {symbol} completed...')

But the tasks execute sequentially, producing the following output:

2020-10-14 20:22:43,293  - root - get_today_Data_async - 398 - INFO - Updating data AAPL started...
2020-10-14 20:22:45,066  - root - get_today_Data_async - 442 - INFO - Updating data AAPL completed...
2020-10-14 20:22:45,066  - root - get_today_Data_async - 398 - INFO - Updating data MSFT started...
2020-10-14 20:22:46,301  - root - get_today_Data_async - 442 - INFO - Updating data MSFT completed...
2020-10-14 20:22:46,301  - root - get_today_Data_async - 398 - INFO - Updating data AMZN started...
2020-10-14 20:22:47,573  - root - get_today_Data_async - 442 - INFO - Updating data AMZN completed...
2020-10-14 20:22:47,573  - root - get_today_Data_async - 398 - INFO - Updating data FB started...
2020-10-14 20:22:48,907  - root - get_today_Data_async - 442 - INFO - Updating data FB completed...
2020-10-14 20:22:48,907  - root - get_today_Data_async - 398 - INFO - Updating data GOOGL started...
2020-10-14 20:22:51,266  - root - get_today_Data_async - 442 - INFO - Updating data GOOGL completed...
2020-10-14 20:22:51,266  - root - get_today_Data_async - 398 - INFO - Updating data GOOG started...
2020-10-14 20:22:52,585  - root - get_today_Data_async - 442 - INFO - Updating data GOOG completed...
2020-10-14 20:22:52,585  - root - get_today_Data_async - 398 - INFO - Updating data JNJ started...
2020-10-14 20:22:54,041  - root - get_today_Data_async - 442 - INFO - Updating data JNJ completed...
2020-10-14 20:22:54,041  - root - get_today_Data_async - 398 - INFO - Updating data PG started...
2020-10-14 20:22:55,275  - root - get_today_Data_async - 442 - INFO - Updating data PG completed...
2020-10-14 20:22:55,275  - root - get_today_Data_async - 398 - INFO - Updating data V started...
2020-10-14 20:22:56,563  - root - get_today_Data_async - 442 - INFO - Updating data V completed..

This means the tasks are executing in sequence. There are around 500 symbols. Can you please help me execute the tasks in parallel?

In standard CPython, threads never execute Python code truly in parallel: only one thread runs Python bytecode at any given moment.

Python's global interpreter lock (GIL) is a complex mechanism that I will not explain here; you can read about it if you would like. In short, it prevents Python code from running in two different threads at the same time.

So why still use threading or parallel processing? In Python, I/O-bound (input/output) problems are a classic fit for it. I'll explain with an example. Suppose your code makes HTTP requests. Because network data transfer is far slower than CPU processing, the most efficient approach is to issue a request on one thread and then, instead of letting the program sit idle waiting for the response, keep issuing requests on other threads. As each response comes back, you process the output you got from it.
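Applied to the question, this idea might look like the sketch below. In the question's coroutine, `requests.get` is a blocking call, so the event loop cannot switch to another task while it waits, which would explain the sequential log output. One possible approach is to hand each blocking call to a thread through `loop.run_in_executor`. Here `fetch_blocking` is a hypothetical stand-in that sleeps instead of calling the real API, and the symbol list is made up for illustration.

```python
import asyncio
import time


def fetch_blocking(symbol):
    # Hypothetical stand-in for a blocking call such as requests.get(...);
    # it sleeps to mimic network latency instead of hitting a real API.
    time.sleep(0.2)
    return f"{symbol}: done"


async def fetch_all(symbols):
    loop = asyncio.get_running_loop()
    # Passing None uses the loop's default thread pool, so the blocking
    # calls wait in worker threads while the event loop stays free.
    futures = [loop.run_in_executor(None, fetch_blocking, s) for s in symbols]
    # gather returns the results in the same order as the inputs.
    return await asyncio.gather(*futures)


def run_demo(symbols):
    start = time.perf_counter()
    results = asyncio.run(fetch_all(symbols))
    elapsed = time.perf_counter() - start
    return results, elapsed


results, elapsed = run_demo(["AAPL", "MSFT", "AMZN", "FB"])
print(results, f"{elapsed:.2f}s")
```

With four symbols the total time should be close to a single 0.2 s wait rather than four of them. For around 500 symbols, the default thread pool caps how many requests are in flight at once, which may also help with API rate limits; an async HTTP client would be another option.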

That's why many problems in Python, CPU-bound ones in particular, probably should not be multithreaded, while in other languages threading them has some benefit.

One way to achieve true parallel processing in Python is the multiprocessing module. Keep in mind that it uses more RAM than a normal Python execution, because each process has its own interpreter and its own copy of the program's data in memory, and it will not necessarily be quicker, because starting and stopping processes takes time.
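A minimal sketch of the multiprocessing approach, assuming a CPU-bound job: `sum_of_squares` and `run_parallel` are hypothetical names chosen for illustration, not part of the question's code. The `if __name__ == "__main__":` guard matters because, on platforms that start workers by re-importing the main module, unguarded code would run again in every worker.

```python
from multiprocessing import Pool


def sum_of_squares(n):
    # CPU-bound work: pure computation with no I/O to wait on.
    return sum(i * i for i in range(n))


def run_parallel(inputs, workers=2):
    # Each worker is a separate process with its own interpreter and GIL,
    # so the computations can run on different CPU cores at the same time.
    with Pool(processes=workers) as pool:
        return pool.map(sum_of_squares, inputs)


if __name__ == "__main__":
    print(run_parallel([10, 100, 1000]))  # [285, 328350, 332833500]
```

For inputs this small, the cost of starting the processes will likely outweigh any gain, which is the trade-off described above. The approach tends to pay off only when each task does substantial computation.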


 