Python asyncio tasks not executing in parallel

I am creating async tasks so they can execute in parallel, like this:

for symbol in config.symbol_list:
    tasks.append(asyncio.ensure_future(get_today_Data_async(symbol), loop=loop))
loop.run_until_complete(asyncio.wait(tasks))

This is the task which I want to execute in parallel:

async def get_today_Data_async(symbol):

    periodType = 'day'
    period = 1
    frequencyType = 'minute'
    frequency = '1'
    use_last10_Min = False
    logging.info(f'Updating data {symbol} started...')
    try:
        logging.info(f'thread id - {threading.get_ident()} getting market data {symbol} periodType {periodType} period {period} frequencyType {frequencyType} frequency {frequency}')

        est = pytz.timezone('US/Eastern')
        if use_last10_Min:
            startDate = (datetime.datetime.now()- datetime.timedelta(minutes=10)).astimezone(tz=est).timestamp()
        else:
            startDate =(datetime.datetime.now().replace(hour=0, minute=0, second=0, microsecond=0)).astimezone(tz=est).timestamp()
        endDate = (datetime.datetime.now()+datetime.timedelta(hours=48)).astimezone(tz=est).timestamp()
        endDate = str(endDate).split('.')[0] + '000'
        startDate = str(startDate).split('.')[0] + '000'

        reqDict = {'apikey': '' + config.client_id + '@AMER.OAUTHAP','endDate': endDate, 'frequencyType': frequencyType,'frequency': frequency,
                   'startDate': startDate, 'needExtendedHoursData': usePreMarket}

        header = {'Authorization': 'Bearer ' + config.token['access_token'] + '', 'content-type': 'application/json'}
        logging.info(f"thread id - {threading.get_ident()} datetime check {symbol} {datetime.datetime.now()}   {reqDict}")
        with await tlock:
            resp = requests.get("https://api.tdameritrade.com/v1/marketdata/" + symbol + "/pricehistory", params=reqDict)
        logging.info(f'thread id - {threading.get_ident()} datetime check {symbol} {datetime.datetime.now()} {resp.status_code}')
        if resp.status_code == 200 and not resp.json()['empty']:
            candles = resp.json()['candles']
            logging.info(f"symbol candel {symbol} {frequencyType} {frequency} {period} {get_one_hour(resp.json()['candles'])}")
            if not usePreMarket:
                newcandles = []
                EST = pytz.timezone('us/eastern')
                time_ist_end = datetime.datetime.now(EST).replace(hour=16, minute=00, second=00)
                time_ist_start = time_ist_end.replace(hour=9, minute=30, second=00)
                for x in candles:
                    tmp_date = datetime.datetime.fromtimestamp((x.get('datetime') / 1000), tz=pytz.timezone('US/Eastern'))
                    if tmp_date > time_ist_start and tmp_date < time_ist_end:
                        newcandles.append(x)
                if len(newcandles) > 0:
                    process_price(symbol,newcandles)
            else:
                if len(candles) > 0:
                    process_price(symbol, candles)

        logging.info(f" symbol - {symbol} status code {resp.status_code} resp {resp.text}")

    except Exception as e:
        traceback.print_exc()
        logging.error(f'Error in getting price {e}')
    logging.info(f'Updating data {symbol} completed...')

But the tasks execute sequentially, producing the following output:

2020-10-14 20:22:43,293  - root - get_today_Data_async - 398 - INFO - Updating data AAPL started...
2020-10-14 20:22:45,066  - root - get_today_Data_async - 442 - INFO - Updating data AAPL completed...
2020-10-14 20:22:45,066  - root - get_today_Data_async - 398 - INFO - Updating data MSFT started...
2020-10-14 20:22:46,301  - root - get_today_Data_async - 442 - INFO - Updating data MSFT completed...
2020-10-14 20:22:46,301  - root - get_today_Data_async - 398 - INFO - Updating data AMZN started...
2020-10-14 20:22:47,573  - root - get_today_Data_async - 442 - INFO - Updating data AMZN completed...
2020-10-14 20:22:47,573  - root - get_today_Data_async - 398 - INFO - Updating data FB started...
2020-10-14 20:22:48,907  - root - get_today_Data_async - 442 - INFO - Updating data FB completed...
2020-10-14 20:22:48,907  - root - get_today_Data_async - 398 - INFO - Updating data GOOGL started...
2020-10-14 20:22:51,266  - root - get_today_Data_async - 442 - INFO - Updating data GOOGL completed...
2020-10-14 20:22:51,266  - root - get_today_Data_async - 398 - INFO - Updating data GOOG started...
2020-10-14 20:22:52,585  - root - get_today_Data_async - 442 - INFO - Updating data GOOG completed...
2020-10-14 20:22:52,585  - root - get_today_Data_async - 398 - INFO - Updating data JNJ started...
2020-10-14 20:22:54,041  - root - get_today_Data_async - 442 - INFO - Updating data JNJ completed...
2020-10-14 20:22:54,041  - root - get_today_Data_async - 398 - INFO - Updating data PG started...
2020-10-14 20:22:55,275  - root - get_today_Data_async - 442 - INFO - Updating data PG completed...
2020-10-14 20:22:55,275  - root - get_today_Data_async - 398 - INFO - Updating data V started...
2020-10-14 20:22:56,563  - root - get_today_Data_async - 442 - INFO - Updating data V completed..

This means the tasks are executing in sequence. There are around 500 symbols. Can you please help me so I can execute the tasks in parallel?

In standard CPython, Python code within a single process does not truly run in parallel: only one thread executes Python bytecode at any given moment.

Python's global interpreter lock (GIL) is a complex mechanism that I will not explain here; you can read about it if you like. In short, it prevents Python code from running in two different threads at the same time.

So why still use threading or parallel processing? In Python, I/O (input/output) problems are a classic fit for it. I'll explain with an example. Suppose your code makes HTTP requests. Because network data transfer is far slower than CPU processing, the most efficient approach is to make a request on one thread and then, instead of letting the program sit and wait for the response, keep making requests on other threads. As each response returns, you process the output you got from it.
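As a rough sketch of that idea applied to the question: requests.get is a blocking call, so a coroutine that calls it directly never yields to the event loop, and the tasks run one after another. One possible approach is to hand each blocking call to a worker thread with loop.run_in_executor. In the sketch below, time.sleep stands in for the HTTP request so that it runs without network access, and the names fetch_blocking, fetch_all and run_demo are made up for illustration; they are not from the original code.

```python
import asyncio
import time


def fetch_blocking(symbol):
    """Hypothetical stand-in for a blocking call such as requests.get().

    It sleeps instead of touching the network.
    """
    time.sleep(0.2)
    return f"{symbol}: ok"


async def fetch_all(symbols):
    loop = asyncio.get_running_loop()
    # Passing None uses the loop's default thread pool, so the blocking
    # calls overlap instead of stalling the event loop one by one.
    futures = [loop.run_in_executor(None, fetch_blocking, s) for s in symbols]
    # gather() returns the results in the same order as the inputs.
    return await asyncio.gather(*futures)


def run_demo(symbols):
    """Run all fetches and return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = asyncio.run(fetch_all(symbols))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = run_demo(["AAPL", "MSFT", "AMZN", "FB"])
    print(results)
    print(f"elapsed: {elapsed:.2f}s")
```

With four symbols the waits overlap, so the total time should be close to one call rather than four. For around 500 symbols, the size of the thread pool (or an async HTTP client instead of threads) would probably need tuning; that is outside this sketch.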

That's why, in Python, many problems (CPU-bound ones in particular) probably should not be multithreaded, whereas in other languages multithreading them has some benefit.

One way to achieve true parallel processing with Python is the multiprocessing module. Keep in mind, though, that it uses more RAM than a normal Python execution, because each process holds its own copy of the program in memory, and it will not necessarily be faster, because starting and stopping processes takes time.
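A minimal sketch of the multiprocessing approach, assuming a CPU-bound job: the prime-counting function count_primes and the input sizes are invented for illustration and have nothing to do with the question's code. Each input is handled in a separate worker process, and every process has its own interpreter and its own GIL.

```python
import multiprocessing


def count_primes(limit):
    """CPU-bound example work: count primes below limit by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def main():
    limits = [20_000, 30_000, 40_000, 50_000]
    # Pool.map splits the inputs across the worker processes and
    # returns the results in input order.
    with multiprocessing.Pool(processes=4) as pool:
        results = pool.map(count_primes, limits)
    for limit, primes in zip(limits, results):
        print(f"primes below {limit}: {primes}")


if __name__ == "__main__":
    # The guard matters: with the spawn start method, each worker
    # re-imports this module, and unguarded code would run again there.
    main()
```

For I/O-bound work like the HTTP requests in the question, the process start-up and memory cost described above usually outweighs the gain, so this pattern fits best when the work itself is computation.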
