Shift from a single GPU to multiple GPUs throws TypeError: '<' not supported between instances of 'list' and 'int'
I have moved from using a single GPU to multiple GPUs. The code raises this error:
epoch main/loss validation/main/loss elapsed_time
Exception in main training loop: '<' not supported between instances of 'list' and 'int'
Traceback (most recent call last):
  File "/home/ubuntu/anaconda3/envs/chainer_p36/lib/python3.6/site-packages/chainer/training/trainer.py", line 318, in run
    entry.extension(self)
  File "/home/ubuntu/anaconda3/envs/chainer_p36/lib/python3.6/site-packages/chainer/training/extensions/evaluator.py", line 157, in __call__
    result = self.evaluate()
  File "/home/ubuntu/anaconda3/envs/chainer_p36/lib/python3.6/site-packages/chainer/training/extensions/evaluator.py", line 206, in evaluate
    in_arrays = self.converter(batch, self.device)
  File "/home/ubuntu/anaconda3/envs/chainer_p36/lib/python3.6/site-packages/chainer/dataset/convert.py", line 150, in concat_examples
    return to_device(device, _concat_arrays(batch, padding))
  File "/home/ubuntu/anaconda3/envs/chainer_p36/lib/python3.6/site-packages/chainer/dataset/convert.py", line 35, in to_device
    elif device < 0:
Will finalize trainer extensions and updater before reraising the exception.
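I tried it without a GPU and it works fine. With a single GPU, however, I got an out-of-memory error, so I moved to a p2.8xlarge instance, and now it throws the error above. Where is the problem, and how can I fix it?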
num_gpus = 8
chainer.cuda.get_device_from_id(0).use()
3. # Updater
if num_gpus > 0:
    updater = training.updater.ParallelUpdater(
        train_iter,
        optimizer,
        devices={('main' if device == 0 else str(device)): device
                 for device in range(num_gpus)},
    )
else:
    updater = training.updater.StandardUpdater(train_iter, optimizer,
                                               device=args.gpus)
4. And so on... 5. Training:
trainer.run()
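For reference, the `devices` argument built in the updater step is a plain dict mapping names to GPU ids. A minimal sketch of what that comprehension evaluates to, using a smaller `num_gpus` chosen purely for readability (the value 4 is an assumption for illustration, not the value from the question):

```python
# Illustration only: evaluate the dict comprehension from the updater step
# with a smaller GPU count (4 is an arbitrary choice for readability).
num_gpus = 4

devices = {('main' if device == 0 else str(device)): device
           for device in range(num_gpus)}

print(devices)  # {'main': 0, '1': 1, '2': 2, '3': 3}
```

Every value in this dict is a single integer id, so the updater's `devices` argument on its own does not put a list where one device id is expected.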
Output: epoch main/loss validation/main/loss elapsed_time Exception in main training loop: '<' not supported between instances of 'list' and 'int'
I expect the output to be:
epoch main/loss validation/main/loss elapsed_time
1.
2.
3. and so on until it converges.
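The error appears to come from the Evaluator extension, at the point where it transfers data to the specified device. How do you specify device in Evaluator.__init__? Note that it should be a single device. This example may serve as a reference: https://github.com/chainer/chainer/blob/master/examples/mnist/train_mnist_data_parallel.py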
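The failing line in the traceback, `elif device < 0:`, compares the device with an integer, which is why a list in that position raises this `TypeError`. The sketch below reproduces that with a simplified stand-in: `to_device_sketch` is a hypothetical helper written for illustration, not Chainer's actual code, and the list `gpus` is an assumption about what was passed as `device`, since the question does not show that call.

```python
def to_device_sketch(device, x):
    """Simplified stand-in for the comparison seen in the traceback.

    Hypothetical helper for illustration; not Chainer's implementation.
    """
    if device is None:
        return x
    elif device < 0:  # raises TypeError when `device` is a list
        return ('cpu', x)
    else:
        return ('gpu:%d' % device, x)


gpus = [0, 1, 2, 3, 4, 5, 6, 7]  # assumed: a list of GPU ids

message = None
try:
    to_device_sketch(gpus, 'batch')
except TypeError as e:
    message = str(e)
    print(message)

# Passing a single id from the list avoids the comparison error.
result = to_device_sketch(gpus[0], 'batch')
print(result)
```

Under these assumptions, the equivalent change in the training script would be to give the Evaluator one device id rather than the whole list.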