Hi, I am writing a simple script that uses multiple processes. Here is the class I am using:
    class WorkerProcess(multiprocessing.Process):
        def __init__(self, batch):
            multiprocessing.Process.__init__(self)
            self.batch = batch
            self.data_frame = pd.DataFrame()

        def run(self):
            temp = []
            for item in self.batch:
                temp.append(item)
            self.data_frame = pd.DataFrame(temp, columns=temp[0].keys())
            print('empty: ', self.data_frame.empty)  # everything is fine
Later I start the processes and join them:
    workers = []
    for i in range(max_processes):
        try:
            batch = batches_data.pop()
            workers.append(WorkerProcess(batch))
        except Exception as e:
            pass

    for worker in workers:
        worker.start()

    for worker in workers:
        worker.join()

    for worker in workers:
        print(worker.data_frame)  # it is empty
When I print data_frame, it is empty even though it was changed in the run() function.
What am I missing?
Processes do not share an address space. Because of Linux's copy-on-write forking strategy, you may get the impression that a child process shares memory with its parent, but in reality the child works on its own copy.
This means that changes made in the child process will not be reflected in its parent (or in any other process). Your run() method assigns self.data_frame in the child's copy of the object; the parent's copy is never touched.
The Python multiprocessing library offers several mechanisms for exchanging data between processes, such as multiprocessing.Queue, multiprocessing.Pipe, and shared-memory Value/Array objects.