I'm very new to multi-threading. I have 2 functions in my python script. One function enqueue_tasks
iterates through a large list of small items and performs a task on each item which involves appending an item to a list (lets call it master_list
). This I already have multi-threaded using futures.
executor = concurrent.futures.ThreadPoolExecutor(15) # Arbitrarily 15
futures = [executor.submit(enqueue_tasks, group) for group in grouper(key_list, 50)]
concurrent.futures.wait(futures)
I have another function process_master
that iterates through the master_list
above and checks the status of each item in the list, then does some operation.
Can I use the same method above to use multi-threading for process_master
? Furthermore, can I have it running at the same time as enqueue_tasks
? What are the implications of this? process_master
is dependent on the list from enqueue_tasks
, so will running them at the same time be a problem? Is there a way I can delay the second function a bit? (using time.sleep
perhaps)?
No, this isn't safe. If enqueue_tasks
and process_master
are running at the same time, you could potentially be adding items to master_list
inside enqueue_tasks
at the same time process_master
is iterating over it. Changing the size of an iterable while you iterate over it causes undefined behavior in Python, and should always be avoided. You should use a threading.Lock
to protect the code that adds items to master_list
, as well as the code that iterates over master_list
, to ensure they never run at the same time.
Better yet, use a Queue.Queue
( queue.Queue
in Python 3.x) instead of a list
, which is a thread-safe data structure. Add items to the Queue
in enqueue_tasks
, and get
items from the Queue
in process_master
. That way process_master
can safely run a the same time as enqueue_tasks
.
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.