简体   繁体   中英

Getting TypeError: can't pickle _thread.lock objects

I am querying MongoDB to get list of dictionary and for each dict in the list, I am doing some comparison of values. Based on the result of comparison, I am storing the values of dictionary, comparison result, and other values calculated in a mongoDB collection. I am trying to do this by invoking multiprocessing and am getting this error.

def save_for_doc(doc_id):

    #function to get the fields of doc
    fields = get_fields(doc_id)
    no_of_process = 5
    doc_col_size = 30000
    chunk_size = round(doc_col_size/no_of_process)
    chunk_ranges = range(0, no_of_process*chunk_size, chunk_size)
    processes = [ multiprocessing.Process(target=save_similar_docs, args= 
    (doc_id,client,fields,chunks,chunk_size)) for chunks in chunk_ranges]
    for prc in processes:
       prc.start()

def save_similar_docs(arguments):

     #This function process the args and saves the results to MongoDB. Does not 
     #return anything as the end result is directly stored.

Below is the error:

 File "H:/Desktop/Performance Improvement/With_Process_Pool.py", line 144, 
 in save_for_doc
   prc.start()

 File "C:\ProgramData\Anaconda3\lib\multiprocessing\process.py", line 105, 
 in start
  self._popen = self._Popen(self)

 File "C:\ProgramData\Anaconda3\lib\multiprocessing\context.py", line 223, 
 in _Popen
   return _default_context.get_context().Process._Popen(process_obj)

 File "C:\ProgramData\Anaconda3\lib\multiprocessing\context.py", line 322, 
 in _Popen
   return Popen(process_obj)

 File "C:\ProgramData\Anaconda3\lib\multiprocessing\popen_spawn_win32.py", 
 line 65, in __init__
reduction.dump(process_obj, to_child)

 File "C:\ProgramData\Anaconda3\lib\multiprocessing\reduction.py", line 60, 
 in dump
        ForkingPickler(file, protocol).dump(obj)

        TypeError: can't pickle _thread.lock objects

What does this error mean? Please explain and how can I get over.

The documentation says that you can't copy a client from a main process to a child process, you have to create the connection after you fork. The client object cannot be copied, create connections, after you fork the process.

On Unix systems the multiprocessing module spawns processes using fork(). Care must be taken when using instances of MongoClient with fork(). Specifically, instances of MongoClient must not be copied from a parent process to a child process. Instead, the parent process and each child process must create their own instances of MongoClient.

http://api.mongodb.com/python/current/faq.html#id21

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM