TensorFlow Keras: TypeError: can't pickle _thread.RLock objects when using multiprocessing

I have also raised this issue on GitHub: https://github.com/tensorflow/tensorflow/issues/46917

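I am trying to use multiprocessing to speed up some of my code. I need to send a Keras model to each worker process and use it to predict on some inputs, followed by some further computation. However, I end up with the following error: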

TypeError: can't pickle _thread.RLock objects

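I have tried:

  1. Using partial to fix the model argument and mapping the resulting partial function.
  2. Cloning the model and using a separate clone for each worker.
  3. Saving the model and reloading it for each worker.
  4. Using pathos.multiprocessing.

None of these worked.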

Below is a minimal working example (MWE):

import tensorflow as tf
from tensorflow import keras
import numpy as np


from multiprocessing import Pool
# from multiprocessing.dummy import Pool as ThreadPool
# from pathos.multiprocessing import ProcessingPool as Pool
from functools import partial


def simple_model():
    model = keras.models.Sequential([
        keras.layers.Dense(units = 10, input_shape = [1]),
        keras.layers.Dense(units = 1, activation = 'sigmoid')
    ])
    model.compile(optimizer = 'sgd', loss = 'mean_squared_error')
    return model

def clone_model(model):
    model_clone = tf.keras.models.clone_model(model)
    model_clone.set_weights(model.get_weights())
    model_clone.build((None, 1))
    model_clone.compile(optimizer = 'sgd', loss = 'mean_squared_error')
    return model_clone

def work(model, seq):
    return model.predict(seq)

def load_model(model_savepath):
    return tf.keras.models.load_model(model_savepath)

def worker(model, n = 4):
    seqences = np.arange(0,100).reshape(n, -1)
    pool = Pool()
    model_savepath = './simple_model.h5'
    model.save(model_savepath)
    model_list = [load_model(model_savepath) for _ in range(n)]
    # model_list = [clone_model(model) for _ in range(n)]
    results = pool.map(work, zip(model_list,seqences))
    # partial_work = partial(work, model=model)
    # results = pool.map(partial_work, seqences)
    pool.close()
    pool.join()
    
    return np.reshape(results, (-1, ))



if __name__ == '__main__':

    model = simple_model()
    out = worker(model, n=4)
    print(out)

This results in the following traceback:

File "c:/Users/***/Documents/GitHub/COVID-NSF/test4.py", line 42, in <module>
  out = worker(model, n=4)
File "c:/Users/****/Documents/GitHub/COVID-NSF/test4.py", line 30, in worker
  results = pool.map(work, zip(model_list,seqences))
File "C:\Users\****\anaconda3\envs\tf-gpu\lib\multiprocessing\pool.py", line 268, in map
  return self._map_async(func, iterable, mapstar, chunksize).get()
File "C:\Users\****\anaconda3\envs\tf-gpu\lib\multiprocessing\pool.py", line 657, in get
  raise self._value
File "C:\Users\***\anaconda3\envs\tf-gpu\lib\multiprocessing\pool.py", line 431, in _handle_tasks
  put(task)
File "C:\Users\***\anaconda3\envs\tf-gpu\lib\multiprocessing\connection.py", line 206, in send
  self._send_bytes(_ForkingPickler.dumps(obj))
File "C:\Users\***\anaconda3\envs\tf-gpu\lib\multiprocessing\reduction.py", line 51, in dumps
  cls(buf, protocol).dump(obj)
TypeError: can't pickle _thread.RLock objects
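
Reading the traceback: the exception is raised inside _ForkingPickler.dumps, that is, while the pool is serializing the task arguments to send them to a worker process. A minimal sketch of the same kind of failure without TensorFlow is below. FakeModel is a hypothetical stand-in (not a Keras class) for any object that holds a threading.RLock, and can_pickle is a made-up helper. It only illustrates why binding the model with partial does not help, while a plain path string pickles without trouble.

```python
import pickle
import threading
from functools import partial


class FakeModel:
    """Hypothetical stand-in for an object that keeps a thread lock inside."""

    def __init__(self):
        self._lock = threading.RLock()

    def predict(self, seq):
        return [2 * x for x in seq]


def work(model, seq):
    return model.predict(seq)


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, the step a Pool performs
    on every task before sending it to a worker process."""
    try:
        pickle.dumps(obj)
        return True
    except (TypeError, pickle.PicklingError):
        return False


model = FakeModel()
print(can_pickle(model))                 # False: the RLock inside cannot be pickled
print(can_pickle(partial(work, model)))  # False: partial still carries the model
print(can_pickle("./simple_model.h5"))   # True: a path is just a string
```

This is why attempts 1 to 3 in the list above all hit the same error: in each case a model object is still part of the task that gets pickled.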

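@Aaron, thanks for explaining amahendrakar's comment on GitHub. I modified the code so that it sends the path of the model, rather than the model itself, to the child processes. Below is the working code: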

import tensorflow as tf
from tensorflow import keras
import numpy as np


# from multiprocessing import Pool
from multiprocessing.dummy import Pool as ThreadPool
from pathos.multiprocessing import ProcessingPool as Pool
from functools import partial
import time

def simple_model():
    model = keras.models.Sequential([
        keras.layers.Dense(units = 10, input_shape = [1]),
        keras.layers.Dense(units = 1, activation = 'sigmoid')
    ])
    model.compile(optimizer = 'sgd', loss = 'mean_squared_error')
    return model

def clone_model(model):
    model_clone = tf.keras.models.clone_model(model)
    model_clone.set_weights(model.get_weights())
    model_clone.build((None, 1))
    model_clone.compile(optimizer = 'sgd', loss = 'mean_squared_error')
    return model_clone

def work(model, seq):
    return model.predict(seq)

def work_new(seq):
    model_savepath = './simple_model.h5'
    model = tf.keras.models.load_model(model_savepath)
    return model.predict(seq)

def load_model(model_savepath):
    return tf.keras.models.load_model(model_savepath)

def worker(model, n = 4):
    seqences = np.arange(0,10*n).reshape(n, -1)
    pool = Pool()
    model_savepath = './simple_model.h5'
    model.save(model_savepath)
    # model_list = [load_model(model_savepath) for _ in range(n)]
    # model_list = [clone_model(model) for _ in range(n)]
    # results = pool.map(work, zip(model_list,seqences))
    # path_list = [[model_savepath] for _ in range(n)]
    # print(np.shape(path_list), np.shape(seqences))
    # work_new_partial = partial(work_new, path=model_savepath)
    results = pool.map(work_new,  seqences)
    # partial_work = partial(work, model=model)
    # results = pool.map(partial_work, seqences)
    pool.close()
    pool.join()
    # print(t1-t0)
    return np.reshape(results, (-1, ))



if __name__ == '__main__':

    model = simple_model()
    t0 = time.perf_counter()
    out = worker(model, n=40)
    t1 = time.perf_counter()

    # print(out)
    print(f"time taken {t1 - t0}")

This results in:

time taken 8.521342800000001
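
Note that work_new above calls load_model on every task, so with n=40 the model file is read 40 times. One possible variation, sketched below under assumptions, is to load the model once per worker process through the initializer argument of multiprocessing.Pool. To keep the sketch runnable without TensorFlow, load_model here is a hypothetical stand-in that reads two numbers from a JSON file; in the real code its body would be tf.keras.models.load_model(path). The names init_worker, work_cached and run are made up for illustration.

```python
import json
import os
import tempfile
from multiprocessing import Pool

_model = None  # one copy per worker process, filled in by init_worker


def load_model(path):
    """Hypothetical stand-in for tf.keras.models.load_model(path)."""
    with open(path) as f:
        weights = json.load(f)
    return lambda seq: [weights["w"] * x + weights["b"] for x in seq]


def init_worker(path):
    """Runs once in each worker process: load the model from its path."""
    global _model
    _model = load_model(path)


def work_cached(seq):
    """Predict with the model that this process loaded in init_worker."""
    return _model(seq)


def run(path, chunks, processes=2):
    # Only the path string and the input chunks are pickled, never the model.
    with Pool(processes, initializer=init_worker, initargs=(path,)) as pool:
        return pool.map(work_cached, chunks)


if __name__ == "__main__":
    # Write a tiny "model" file so the example is self-contained.
    path = os.path.join(tempfile.mkdtemp(), "weights.json")
    with open(path, "w") as f:
        json.dump({"w": 2.0, "b": 1.0}, f)
    print(run(path, [[0, 1], [2, 3]]))  # [[1.0, 3.0], [5.0, 7.0]]
```

Whether this is actually faster for a given model and pool size is something to measure rather than assume.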
