
Can Caffe or Caffe2 be given input data directly from GPU?

I've read the Caffe2 tutorials and tried the pre-trained models. I know Caffe2 will leverage the GPU to run the model/net, but the input data always seems to be given from CPU (i.e. host) memory. For example, in Loading Pre-Trained Models, after the model is loaded, we can predict an image by

result = p.run([img])

However, the image "img" is read in CPU scope. What I am looking for is a framework that can pipeline images (decoded from a video and still residing in GPU memory) directly into the prediction model, instead of copying them from GPU to CPU scope and then transferring them to the GPU again to predict the result. Does Caffe or Caffe2 provide such functions or interfaces for Python or C++? Or would I need to patch Caffe to do so? Thanks.


Here is my solution:

I found that in tensor.h, the function ShareExternalPointer() can do exactly what I want.

Feed GPU data this way:

pInputTensor->ShareExternalPointer(pGpuInput, InputSize);

then run the predict net with

pPredictNet->Run();

where pInputTensor is the input tensor for the predict net pPredictNet.

I don't think you can do it with the Python interface.
But I think it can be accomplished using C++: in C++ you have access to the Blob's mutable_gpu_data(). You may write code that runs on the device and "fills" the input Blob's mutable_gpu_data() directly from the GPU. Once you have made this update, Caffe should be able to continue its net->forward() from there.

UPDATE
On Sep 19th, 2017 PR #5904 was merged into master. This PR exposes GPU pointers of blobs via the Python interface.
You may access blob._gpu_data_ptr and blob._gpu_diff_ptr directly from Python at your own risk.

As you've noted, using a Python layer forces data in and out of the GPU, and this can cause a huge hit to performance. This is true not just for Caffe, but for other frameworks too. To elaborate on Shai's answer, you could look at this step-by-step tutorial on adding C++ layers to Caffe. The example given should touch on most issues dealing with layer implementation. Disclosure: I am the author.
