Possible causes of a CUDA "unable to get properties" device error with Python3 / Theano?

Question

I'm trying to use multiple GPUs with multiprocessing in Python3. I can run a simple test case, like the following:

import theano
import theano.tensor as T
import multiprocessing as mp
import time
# import lasagne

def target():
    import theano.sandbox.cuda
    print("target about to use")
    theano.sandbox.cuda.use('gpu1')
    print("target is using")
    import lasagne
    time.sleep(15)
    print("target is exiting")

x = T.scalar('x', dtype='float32')

p = mp.Process(target=target)

p.start()

time.sleep(1)
import theano.sandbox.cuda
print("master about to use")
theano.sandbox.cuda.use('gpu0')
print("master is using")
import lasagne
time.sleep(4)
print("master will join")

p.join()
print("master is exiting")

When I run this, the master and the spawned process each use a GPU successfully:

>> target about to use
>> master about to use
>> Using gpu device 1: GeForce GTX 1080 (CNMeM is enabled with initial size: 50.0% of memory, cuDNN 5105)
>> target is using
>> Using gpu device 0: GeForce GTX 1080 (CNMeM is enabled with initial size: 50.0% of memory, cuDNN 5105)
>> master is using
>> master will join
>> target is exiting
>> master is exiting

But in a more complex code base, when I try to set up the same scheme, the spawned worker fails with:

ERROR (theano.sandbox.cuda): ERROR: Not using GPU. Initialisation of device 1 failed:
Unable to get properties of gpu 1: initialization error
ERROR (theano.sandbox.cuda): ERROR: Not using GPU. Initialisation of device gpu failed:
Not able to select available GPU from 2 cards (initialization error).

I'm having a hard time tracking down what causes this. In the code snippet above, the problem is reproduced if lasagne is imported at the top, before forking. But I've managed to keep my code from importing lasagne until after forking and trying to use a GPU (I checked sys.modules.keys()), and the problem still persists. I don't see anything Theano-related being imported before forking except theano itself and theano.tensor, and in the example above those are fine.

Has anyone else tracked down anything similar?

Answer 1

I ran into a similar problem when trying to configure Theano with Python3 on a Windows PC with the GTX-980. It worked fine on the CPU, but it would not use the GPU.

After that, I tried configuring it with Python2/Theano, and the problem was resolved. I suppose something could be wrong with the CUDA version. You could give Python2/Theano a try (in a virtual environment if needed).

Answer 2

OK, this turned out to be very simple: I had a stray import theano.sandbox.cuda in a pre-fork location, and that import must happen only after forking. It was still necessary to move the lasagne imports to after the fork as well, in case that helps anyone else.
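A stray import like this is easy to miss in a larger code base, so a guard that runs right before the fork may help. The following is only a sketch built on the sys.modules check mentioned in the question; the helper names gpu_modules_loaded and assert_safe_to_fork and the GPU_MODULES tuple are made up for illustration, and the module list is simply the two names that caused trouble in this thread.

```python
import sys

# Modules that should not be loaded before forking (names taken from this
# thread; extend the tuple for your own code base).
GPU_MODULES = ("theano.sandbox.cuda", "lasagne")


def gpu_modules_loaded(prefixes=GPU_MODULES, modules=None):
    """Return the sorted names of loaded modules matching any prefix.

    `modules` defaults to the names currently in sys.modules; passing an
    explicit list makes the function easy to test.
    """
    names = list(sys.modules) if modules is None else list(modules)
    return sorted(
        name
        for name in names
        if any(name == p or name.startswith(p + ".") for p in prefixes)
    )


def assert_safe_to_fork():
    """Raise if a GPU-related module was imported in this process already."""
    offenders = gpu_modules_loaded()
    if offenders:
        raise RuntimeError("imported before fork: " + ", ".join(offenders))
```

Calling assert_safe_to_fork() immediately before p.start() would have flagged the stray import here, while plain theano and theano.tensor pass the check.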

(In my case, I actually need information from lasagne-based code before the fork, so I have to spawn a throw-away process which loads that code and passes the relevant values back to the master process. The master can then build shared objects accordingly and fork, and each process subsequently builds its own lasagne-based objects, which work on its own GPU.)
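The throw-away process described here could look roughly like the sketch below. It is not the code from the answer: _probe, probe_in_child and the returned shapes are hypothetical stand-ins, and the place where lasagne would really be imported is marked with a comment so the sketch runs without a GPU. It uses the fork start method to match the setup in the question, so it assumes a POSIX system.

```python
import multiprocessing as mp


def _probe(queue):
    # Runs in a short-lived child process. In real code the lasagne import
    # and model construction would go here, so the parent never loads them.
    # Hypothetical stand-in for values derived from lasagne-based code:
    shapes = {"W": (784, 100), "b": (100,)}
    queue.put(shapes)


def probe_in_child():
    """Run _probe in a throw-away child and return what it reports."""
    ctx = mp.get_context("fork")  # same start method as the question (POSIX)
    queue = ctx.Queue()
    child = ctx.Process(target=_probe, args=(queue,))
    child.start()
    result = queue.get()  # read before join so a large payload cannot block
    child.join()
    return result


if __name__ == "__main__":
    shapes = probe_in_child()
    # The master can now build shared objects from `shapes` and only then
    # fork the per-GPU workers, each of which imports lasagne on its own.
    print(shapes)
```

Only plain picklable values (here a dict of tuples) cross the process boundary, so nothing tied to a CUDA context ends up in the master.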

Answer 1
0 2016-11-18 04:48:32

Answer 2
0 accepted 2016-11-23 01:28:21
