How to write a caffe python data layer with preloading?

Question

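How do I write an asynchronous data layer that preloads batches while other processing is being performed? Is there any example code? Thanks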

Answer 1

There are several ways you can achieve what you want. I'll try to sketch one option here.

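The overall view of the system is: you have n Loaders that load data asynchronously and feed a queue. The layer then reads batch_size items from the queue and feeds the net in its forward() function.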

import caffe
import multiprocessing


class Loader(multiprocessing.Process):
  def __init__(self, outq, *args, **kwargs):
    super(Loader, self).__init__()
    self.daemon = True  # terminate together with the parent process
    self.outq = outq
    self.start()  # start working

  def run(self):
    while True:  # read and never stop at all!
      try:
        # do your magic here
        # assuming you load x,y pairs (numpy arrays)
        self.outq.put((x[None, ...], y[None, ...]))  # add singleton "batch" dimension
      except Exception as e:
        # handle errors?
        pass


class MultiProcessInputLayer(caffe.Layer):
  def setup(self, bottom, top):
    # verify no bottoms, right number of tops etc.
    self.n = 4            # number of Loaders (placeholder value, set your own)
    self.batch_size = 32  # items per batch (placeholder value, set your own)
    self.dataQ = multiprocessing.Queue()
    for _ in range(self.n):
      Loader(self.dataQ)  # start n Loaders
    # some other stuff here...

  def reshape(self, bottom, top):
    # reshape the tops to the right sizes
    pass

  def forward(self, bottom, top):
    for i in range(self.batch_size):
      item = self.dataQ.get()  # blocks until a Loader has produced an item
      top[0].data[i, ...] = item[0]
      top[1].data[i, ...] = item[1]

  def backward(self, top, propagate_down, bottom):
    pass  # no backward for data layer

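A few tips and tricks I learned:
1. Use the multiprocessing package and not threading, because of the GIL.
2. Sometimes (e.g. if batch_size is very large) it takes forward() a long time to read item by item from the queue to form each batch. In that case, you can add another multiprocessing.Process that asynchronously reads batch_size items from self.dataQ and writes whole batches to self.batchQ. Then forward() only waits for a single item from self.batchQ on each call.
3. Take care not to copy the data around too much. With large images/labels, all this copying can become the bottleneck.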
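Tip 2 could be sketched roughly as follows. This is only a minimal sketch, not part of the original answer: the names collect_batch and Batcher are hypothetical, and it assumes each queue item is an (x, y) pair of numpy arrays that already carry the singleton batch dimension, as the Loader above produces.

```python
import multiprocessing

import numpy as np


def collect_batch(data_q, batch_size):
    """Block until batch_size (x, y) items are read, then join them into one batch."""
    xs, ys = [], []
    for _ in range(batch_size):
        x, y = data_q.get()  # each item has a leading singleton "batch" dimension
        xs.append(x)
        ys.append(y)
    # Concatenate along the batch axis: (1, ...) * batch_size -> (batch_size, ...)
    return np.concatenate(xs, axis=0), np.concatenate(ys, axis=0)


class Batcher(multiprocessing.Process):
    """Hypothetical helper process: moves whole batches from data_q to batch_q."""

    def __init__(self, data_q, batch_q, batch_size):
        super(Batcher, self).__init__()
        self.daemon = True  # terminate together with the parent process
        self.data_q = data_q
        self.batch_q = batch_q
        self.batch_size = batch_size

    def run(self):
        while True:
            self.batch_q.put(collect_batch(self.data_q, self.batch_size))
```

Under these assumptions, setup() would create a second queue such as self.batchQ = multiprocessing.Queue(maxsize=2) and call Batcher(self.dataQ, self.batchQ, self.batch_size).start(), and forward() would reduce to a single x, y = self.batchQ.get() followed by top[0].data[...] = x and top[1].data[...] = y. Giving the batch queue a small maxsize keeps only a few preloaded batches in memory at a time. Note that each batch is still pickled once more when it crosses the second queue, which is exactly the copying cost tip 3 warns about.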
