Schedule periodic non-blocking tasks in Python without dealing with the GIL

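I have been using Python's schedule module to schedule overlapping tasks that should not block one another; however, the recommended way to get non-blocking behavior is to use Python's threading module (which is, of course, subject to the GIL).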

Quoting the schedule module's documentation:

import threading
import time
import schedule

def job():
    print("I'm running on thread %s" % threading.current_thread())

def run_threaded(job_func):
    job_thread = threading.Thread(target=job_func)
    job_thread.start()

schedule.every(1).seconds.do(run_threaded, job)
schedule.every(2).seconds.do(run_threaded, job)
schedule.every(3).seconds.do(run_threaded, job)
schedule.every(4).seconds.do(run_threaded, job)
schedule.every(5).seconds.do(run_threaded, job)

while 1:
    schedule.run_pending()
    time.sleep(1)

How can we schedule and execute tasks without worrying about the GIL?

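A relatively simple way to solve this is to use the multiprocessing module. Each job runs in its own process, which gives us cron-like behavior without worrying about GIL contention between jobs.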

Avoiding GIL problems with multiprocessing is nothing novel... I'm just documenting it here in the hope that it helps future googlers.
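Before the full scheduler subclass, the core idea can be sketched as the process-based counterpart of run_threaded from the question. This is only a minimal sketch: run_in_process and report_pid are hypothetical names introduced here for illustration, and it assumes the job is a module-level function so that it can be pickled when the start method is spawn (the default on Windows and macOS).

```python
import multiprocessing
import os


def run_in_process(job_func, *args, **kwargs):
    """Start job_func in a child process and return the Process handle."""
    proc = multiprocessing.Process(target=job_func, args=args, kwargs=kwargs)
    proc.daemon = True   # unfinished jobs are terminated when the parent exits
    proc.start()
    return proc


def report_pid(queue):
    """Hypothetical job: send this process's PID back to the parent."""
    queue.put(os.getpid())


if __name__ == "__main__":
    queue = multiprocessing.Queue()
    proc = run_in_process(report_pid, queue)
    child_pid = queue.get(timeout=10)   # the job ran in a different process
    proc.join()
    print("parent", os.getpid(), "child", child_pid)
```

With the schedule module, such a helper could be registered the same way as run_threaded, for example schedule.every(1).seconds.do(run_in_process, job). Nothing in this sketch reaps finished processes in a long-running loop; the subclass below keeps a list of processes for that purpose.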

from multiprocessing import Process
from datetime import datetime
import time

from schedule import Scheduler

class MPScheduler(Scheduler):
    def __init__(self, args=None, kwargs=None):
        if args is None:
            args = ()
        if kwargs is None:
            kwargs = {}
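        # Scheduler.__init__ accepts no arguments; args/kwargs are only
        # forwarded to each spawned Process (see _mp_run_job)
        super(MPScheduler, self).__init__()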
        # Among other things, this object inherits self.jobs (a list of jobs)
        self.args = args
        self.kwargs = kwargs
        self.processes = list()

    def _mp_run_job(self, job_func):
        """Spawn another process to run the job; multiprocessing avoids GIL issues"""
        job_process = Process(target=job_func, args=self.args,
            kwargs=self.kwargs)
        job_process.daemon = True
        job_process.start()
        self.processes.append(job_process)

    def run_pending(self):
        """Run any jobs which are ready"""
        runnable_jobs = (job_obj for job_obj in self.jobs if job_obj.should_run)
        for job_obj in sorted(runnable_jobs):
            job_obj.last_run = datetime.now()   # Housekeeping
            self._mp_run_job(job_obj.job_func)
            job_obj._schedule_next_run()        # Schedule the next execution datetime

        self._retire_finished_processes()

    def _retire_finished_processes(self):
        """Walk the list of processes and retire finished processes"""
        retirement_list = list()   # List of process objects to remove
        for idx, process in enumerate(self.processes):
            if process.is_alive():
                # wait a short time for process to finish
                process.join(0.01)
            else:
                retirement_list.append(idx)

        ## Retire finished processes
        for process_idx in sorted(retirement_list, reverse=True):
            self.processes.pop(process_idx)

def job(id, hungry=True):
    print("{} running {} and hungry={}".format(datetime.now(), id, hungry))
    time.sleep(10)   # This job runs without blocking execution of other jobs

if __name__=='__main__':
    # Build a schedule of overlapping jobs...
    mp_sched = MPScheduler()
    mp_sched.every(1).seconds.do(job, id=1, hungry=False)
    mp_sched.every(2).seconds.do(job, id=2)
    mp_sched.every(3).seconds.do(job, id=3)
    mp_sched.every(4).seconds.do(job, id=4)
    mp_sched.every(5).seconds.do(job, id=5)

    while True:
        mp_sched.run_pending()
        time.sleep(1)
