I have a DataFrame of 12,000 rows. I want to split it into chunks and apply a mapping function to them in parallel, using pandas with multiprocessing.
df = pd.read_csv(input_file, dtype=str, names=columns)
df_split = np.array_split(df, 4)
# pool = mp.Pool(4)
for df_data in df_split:
    param = [df_data, version, logger]
    with mp.Pool(4) as pool:
        out_df_lst = pool.map(func, param)
out_df = pd.concat(out_df_lst)
This code runs inside a Django REST API. When I send a POST request through Postman, it fails with the error: 'TypeError: can't pickle _thread.RLock objects'. The program works as intended when I make the same request without any multiprocessing.
Please help me understand this error so that I can make the program work with multiprocessing.
This is the entire Traceback:
Traceback (most recent call last):
  File "C:\Users\kotamrajua\Anaconda3\envs\ahrqcomenv\lib\threading.py", line 916, in _bootstrap_inner
    self.run()
  File "C:\Users\kotamrajua\Anaconda3\envs\ahrqcomenv\lib\threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\kotamrajua\methodologies-pyapis\ahrq_comorbidity\ahrq_comorb_app\scripts\process_ahrq.py", line 97, in run_ahrq_process
    out_df_lst = pool.map(run_comorb_mapping, param)
  File "C:\Users\kotamrajua\Anaconda3\envs\ahrqcomenv\lib\multiprocessing\pool.py", line 266, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "C:\Users\kotamrajua\Anaconda3\envs\ahrqcomenv\lib\multiprocessing\pool.py", line 644, in get
    raise self._value
  File "C:\Users\kotamrajua\Anaconda3\envs\ahrqcomenv\lib\multiprocessing\pool.py", line 424, in _handle_tasks
    put(task)
  File "C:\Users\kotamrajua\Anaconda3\envs\ahrqcomenv\lib\multiprocessing\connection.py", line 206, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "C:\Users\kotamrajua\Anaconda3\envs\ahrqcomenv\lib\multiprocessing\reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
TypeError: can't pickle _thread.RLock objects
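The last frames of the traceback show where this fails: `pool.map` pickles every item it sends to a worker process, and one of the items holds a lock. The `logger` in `param` is a likely culprit, since logging handlers each carry a `threading.RLock`. That is an assumption, because the question does not show how `logger` is created. A minimal sketch that reproduces the same class of error without pandas or Django (`can_pickle` is a hypothetical helper, not part of any library):

```python
import pickle
import threading


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, which Pool.map relies on."""
    try:
        pickle.dumps(obj)
        return True
    except (TypeError, pickle.PicklingError):
        return False


# Plain data such as strings or tuples can be sent to worker processes.
print(can_pickle(("chunk", "v1")))  # True

# A lock cannot, and neither can an object that holds one as an attribute.
print(can_pickle(threading.RLock()))  # False
```

Running a check like this on each element of `param` should point to the argument that cannot cross the process boundary.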
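from pandarallel import pandarallel

pandarallel.initialize()
# With axis=1, func is called once per row, so it must accept a single row
# (a Series) followed by version and logger, not a DataFrame chunk.
# Include raw=True if the func needs ndarrays.
out_df = df.parallel_apply(func, args=(version, logger), axis=1)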
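If you would rather stay with plain `multiprocessing`, here is a minimal sketch of one way to restructure the original loop. It builds one picklable `(chunk, version)` tuple per chunk, sends them all with a single `pool.starmap` call, and looks the logger up inside the worker instead of passing it as an argument. `process_chunk`, `run_parallel`, and the `version` column are hypothetical stand-ins, since the question does not show the real mapping function. The sketch also assumes the logger is the argument that fails to pickle.

```python
import logging
import multiprocessing as mp

import numpy as np
import pandas as pd


def process_chunk(chunk, version):
    """Hypothetical stand-in for the real mapping function."""
    # Look the logger up inside the worker, so it is never pickled.
    logger = logging.getLogger(__name__)
    logger.info("processing %d rows", len(chunk))
    out = chunk.copy()
    out["version"] = version  # placeholder for the real mapping
    return out


def run_parallel(df, version, workers=4):
    """Split df into row chunks and map process_chunk over them."""
    # Compute chunk boundaries by position, then slice with iloc.
    bounds = np.linspace(0, len(df), workers + 1, dtype=int)
    chunks = [df.iloc[a:b] for a, b in zip(bounds[:-1], bounds[1:])]
    # One argument tuple per chunk; only picklable objects go in.
    tasks = [(chunk, version) for chunk in chunks]
    with mp.Pool(workers) as pool:
        results = pool.starmap(process_chunk, tasks)
    return pd.concat(results)
```

You would call it as `out_df = run_parallel(df, version)`. On Windows, worker processes are started fresh and import the module, so `process_chunk` has to be defined at module top level, and any logging configuration the workers need has to be set up inside them.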