I run some jobs in parallel, which can sometimes take a long time, so I want the main thread to report on the progress, for example once every hour.
Below is a simplified version of what I came up with. The code will run test_function in 2 threads with arguments from input_arguments. Every 5 seconds it will print the percentage of jobs finished.
import threading
import queue
import time

def test_function(x):
    time.sleep(4)
    print("Finished ", x)

num_processes = 2
input_arguments = range(10)

# Define a worker which will continuously execute function taking input parameters from the queue
def worker():
    while True:
        x = q.get()
        if x is None:
            break
        test_function(x)
        q.task_done()

# Initialize queue and the threads
q = queue.Queue()
threads = []
for i in range(num_processes):
    t = threading.Thread(target=worker)
    t.start()
    threads.append(t)

# Create a queue of input parameters for function
for item in input_arguments:
    q.put(item)

# Report progress every 5 seconds
report_progress(q)

# stop workers
for i in range(num_processes):
    q.put(None)
for t in threads:
    t.join()
where report_progress is defined as follows:
def report_progress(q):
    qsize_init = q.qsize()
    while not q.empty():
        time.sleep(5)
        portion_finished = 1 - q.qsize() / qsize_init
        print("run_parallel: {:.1%} jobs are finished".format(portion_finished))
However, I want to report the progress every hour rather than every 5 seconds, and with a one-hour sleep the program might sit idle for many minutes after all jobs have finished.
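One caveat about the figure printed by report_progress: q.qsize() counts the items still waiting in the queue, so a job that a worker has taken but not yet completed already looks "finished". The short sketch below illustrates the difference. It relies on unfinished_tasks, an attribute that CPython's queue.Queue maintains for q.join() but that is not part of the documented API, so treat it as an implementation detail rather than something to depend on.

```python
import queue

q = queue.Queue()
for i in range(3):
    q.put(i)

# A worker takes one job but has not completed it yet.
item = q.get()
waiting_after_get = q.qsize()               # items still waiting in the queue
unfinished_after_get = q.unfinished_tasks   # undocumented CPython attribute

# Only task_done() marks the job as completed.
q.task_done()
unfinished_after_done = q.unfinished_tasks

print(waiting_after_get, unfinished_after_get, unfinished_after_done)
```

For the same reason, q.empty() becomes true when the last item is taken from the queue, which can be before the last job has actually finished.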
Another possibility is to define report_progress differently:
def report_progress(q):
    qsize_init = q.qsize()
    time_start = time.time()
    while not q.empty():
        current_time = time.time()
        if current_time - time_start > 5:
            portion_finished = 1 - q.qsize() / qsize_init
            print("run_parallel: {:.1%} jobs are finished".format(portion_finished))
            time_start = time.time()
I am worried that constantly checking this condition in a tight loop will waste CPU resources. Each check is cheap, but over a scale of hours it could add up to a lot.
Is there a standard way of handling this?
Python: 3.6
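One option from the standard library is concurrent.futures: concurrent.futures.wait() blocks until either all futures have completed or the timeout expires, so the main thread neither polls nor stays idle after the last job ends. The following is only a minimal sketch of that idea, not a drop-in replacement for the code above; run_with_reports, max_workers and report_interval are names made up for this example.

```python
import concurrent.futures
import time


def run_with_reports(func, items, max_workers=2, report_interval=3600.0):
    """Hypothetical helper: run func over items and report progress periodically.

    Returns the list of progress fractions that were printed.
    """
    reports = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(func, x) for x in items]
        pending = set(futures)
        while pending:
            # Blocks without busy-waiting; returns early once everything is done.
            done, pending = concurrent.futures.wait(pending, timeout=report_interval)
            if pending:
                portion_finished = 1 - len(pending) / len(futures)
                reports.append(portion_finished)
                print("run_parallel: {:.1%} jobs are finished".format(portion_finished))
    return reports


if __name__ == "__main__":
    # Short sleeps and a short interval so the demo finishes quickly.
    run_with_reports(lambda x: time.sleep(0.2), range(10), report_interval=0.5)
```

Here progress is counted from completed futures rather than from the queue size, so a job only counts as finished once its function has returned.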
For now I will use a simple solution, suggested in the comments by @Andriy Maletsky.
The main thread will check every few seconds whether q is empty yet, and it will print a progress message if more than 1 hour has passed since the last report.
time_between_reports = 3600
time_between_checks = 5

def report_progress_until_finished(q):
    qsize_init = q.qsize()
    last_report_time = time.time()
    while not q.empty():
        time_elapsed = time.time() - last_report_time
        if time_elapsed > time_between_reports:
            portion_finished = 1 - q.qsize() / qsize_init
            print("run_parallel: {:.1%} jobs are finished".format(portion_finished))
            last_report_time = time.time()
        time.sleep(time_between_checks)
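If the queue-and-worker structure is kept, a possible variant of this loop waits on a threading.Event with a timeout instead of sleeping for a fixed time. Event.wait(timeout) returns True as soon as the event is set and False when the timeout expires, so the main thread can sleep for the full report interval and still wake up immediately when the last job ends. This is a sketch under assumptions: run_with_progress, num_threads, report_interval and the finished counter are invented for the illustration, and exceptions in a job are simply printed.

```python
import queue
import threading
import time


def run_with_progress(func, items, num_threads=2, report_interval=3600.0):
    """Hypothetical helper; returns the progress fractions that were printed."""
    items = list(items)
    total = len(items)
    q = queue.Queue()
    lock = threading.Lock()
    all_done = threading.Event()
    finished = 0
    if total == 0:
        all_done.set()  # nothing to wait for

    def worker():
        nonlocal finished
        while True:
            x = q.get()
            if x is None:  # sentinel: no more jobs
                break
            try:
                func(x)
            except Exception as exc:  # keep the worker alive on a failed job
                print("Job {!r} failed: {}".format(x, exc))
            finally:
                with lock:
                    finished += 1
                    if finished == total:
                        all_done.set()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for item in items:
        q.put(item)
    for _ in threads:
        q.put(None)

    reports = []
    # Sleeps up to report_interval seconds, but wakes at once when all_done is set.
    while not all_done.wait(timeout=report_interval):
        with lock:
            portion_finished = finished / total
        reports.append(portion_finished)
        print("run_parallel: {:.1%} jobs are finished".format(portion_finished))
    for t in threads:
        t.join()
    return reports


if __name__ == "__main__":
    run_with_progress(lambda x: time.sleep(0.2), range(10), report_interval=0.5)
```

Because the counter is incremented only after func returns, the reported percentage reflects completed jobs rather than jobs that have merely been taken from the queue.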