Tf-Agents ParallelPyEnvironment fails silently

I have written a custom environment so I can experiment with reinforcement learning (PPO) and tf-agents. This works fine if I wrap my env (which inherits from py_environment.PyEnvironment) in a TFPyEnvironment, but fails if I try to wrap it in a ParallelPyEnvironment. I have tried all the keyword arguments of ParallelPyEnvironment, but the code just runs up to that line and then nothing happens: no exception is raised and the program does not terminate.

Here is the code that initialises the environments, including the working variant used for eval_env:

train_env = tf_py_environment.TFPyEnvironment(
    ParallelPyEnvironment(
        [CardGameEnv()] * hparams['parallel_environments']
    )
)
# this works perfectly:
eval_env = tf_py_environment.TFPyEnvironment(CardGameEnv(debug=True))

If I terminate the script with CTRL+C, this is the output:

Traceback (most recent call last):
Traceback (most recent call last):
  File "E:\Users\tmp\Documents\Programming\Neural Nets\Poker_AI\poker_logic\train.py", line 229, in <module>
  File "<string>", line 1, in <module>
    train(model_num=3)
  File "C:\Python37\lib\multiprocessing\spawn.py", line 105, in spawn_main
  File "E:\Users\tmp\Documents\Programming\Neural Nets\Poker_AI\poker_logic\train.py", line 64, in train
    [CardGameEnv()] * hparams['parallel_environments']
    exitcode = _main(fd)
  File "E:\Users\tmp\AppData\Roaming\Python\Python37\site-packages\gin\config.py", line 1009, in wrapper
  File "C:\Python37\lib\multiprocessing\spawn.py", line 113, in _main
    preparation_data = reduction.pickle.load(from_parent)
KeyboardInterrupt
    return fn(*new_args, **new_kwargs)
  File "C:\Python37\lib\site-packages\tf_agents\environments\parallel_py_environment.py", line 70, in __init__
    self.start()
  File "C:\Python37\lib\site-packages\tf_agents\environments\parallel_py_environment.py", line 83, in start
    env.start(wait_to_start=self._start_serially)
  File "C:\Python37\lib\site-packages\tf_agents\environments\parallel_py_environment.py", line 223, in start
    self._process.start()
  File "C:\Python37\lib\multiprocessing\process.py", line 112, in start
    self._popen = self._Popen(self)
  File "C:\Python37\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\Python37\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "C:\Python37\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
    reduction.dump(process_obj, to_child)
  File "C:\Python37\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
  File "C:\Python37\lib\site-packages\tf_agents\environments\parallel_py_environment.py", line 264, in __getattr__
    return self._receive()
  File "C:\Python37\lib\site-packages\tf_agents\environments\parallel_py_environment.py", line 333, in _receive
    message, payload = self._conn.recv()
  File "C:\Python37\lib\multiprocessing\connection.py", line 250, in recv
    buf = self._recv_bytes()
  File "C:\Python37\lib\multiprocessing\connection.py", line 306, in _recv_bytes
    [ov.event], False, INFINITE)
KeyboardInterrupt
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "C:\Python37\lib\site-packages\tf_agents\environments\parallel_py_environment.py", line 289, in close
    self._process.join(5)
  File "C:\Python37\lib\multiprocessing\process.py", line 139, in join
    assert self._popen is not None, 'can only join a started process'
AssertionError: can only join a started process

From that I conclude that the thread ParallelPyEnvironment is trying to start never actually starts, but since I'm not very experienced with threading in Python, I have no idea where to go from here, especially how to fix this. Training currently takes a long time and barely uses my PC's capabilities (3 GB of 32 GB RAM used, processor at 3%, GPU barely working at all but VRAM full), so running the environments in parallel should speed up training significantly.

The solution is to pass in callables, not environments, so that ParallelPyEnvironment can construct them itself. Its first argument is a list of environment constructors, and each one is called inside its own subprocess:

train_env = tf_py_environment.TFPyEnvironment(
    ParallelPyEnvironment(
        [CardGameEnv] * hparams['parallel_environments']
    )
)
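The difference between the two list expressions can be seen without tf-agents at all. The following is a minimal sketch in plain Python; DummyEnv is a hypothetical stand-in for CardGameEnv, and the debug argument is only assumed here to mirror the eval_env call in the question. It shows that `[Env()] * n` holds n references to one already-built object, while `[Env] * n` holds callables that the receiver can invoke later, and that functools.partial can bind constructor arguments without building anything yet.

```python
import functools


class DummyEnv:
    """Hypothetical stand-in for CardGameEnv; not part of tf-agents."""

    def __init__(self, debug=False):
        self.debug = debug


num_envs = 4

# Pattern from the question: one instance is built up front, and the list
# then holds four references to that same object.
shared = [DummyEnv()] * num_envs
all_same_object = all(env is shared[0] for env in shared)

# Pattern from the answer: the list holds callables, so whoever receives
# the list decides when (and where) each environment is constructed.
constructors = [DummyEnv] * num_envs
built = [make_env() for make_env in constructors]
distinct_ids = len({id(env) for env in built})

# Constructor arguments can be bound ahead of time without calling the class.
debug_constructors = [functools.partial(DummyEnv, debug=True)] * num_envs
debug_built = [make_env() for make_env in debug_constructors]

print(all_same_object)                        # True
print(distinct_ids)                           # 4
print(all(env.debug for env in debug_built))  # True
```

If CardGameEnv needs arguments in the parallel case, the same functools.partial pattern should carry over, since a partial object is still a callable.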

