Enabling multithreading on Caffe2

Question

When compiling my program with Caffe2, I get the following warnings:

[E init_intrinsics_check.cc:43] CPU feature avx is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
[E init_intrinsics_check.cc:43] CPU feature avx2 is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
[E init_intrinsics_check.cc:43] CPU feature fma is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.

Since I do want multithreading support in Caffe2, I searched for what to do. I found that Caffe2 has to be recompiled with some arguments set, either when invoking cmake or in the CMakeLists.

Since I had already installed pytorch in a conda environment, I first uninstalled Caffe2:

pip uninstall -y caffe2

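Then I followed the instructions in the Caffe2 documentation to build it from source. I first installed the dependencies as instructed. Then I downloaded pytorch inside my conda env: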

git clone https://github.com/pytorch/pytorch.git && cd pytorch
git submodule update --init --recursive

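At this point I figured it was time to edit the pytorch\caffe2\CMakeLists file I had just downloaded. I have read that enabling the USE_NATIVE_ARCH option in this CMakeLists is enough to enable multithreading support, but I cannot find such an option where I am looking. Maybe I am doing something wrong. Any ideas? Thanks.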

Here are some details about my platform:

I am on macOS Big Sur
My python version is 3.8.5

Update:

To answer Nega, this is what I get:

python3 -c 'import torch; print(torch.__config__.parallel_info())'
ATen/Parallel:
    at::get_num_threads() : 1
    at::get_num_interop_threads() : 4
OpenMP not found
Intel(R) Math Kernel Library Version 2020.0.2 Product Build 20200624 for Intel(R) 64 architecture applications
    mkl_get_max_threads() : 4
Intel(R) MKL-DNN v0.21.1 (Git Hash 7d2fd500bc78936d1d648ca713b901012f470dbc)
std::thread::hardware_concurrency() : 8
Environment variables:
    OMP_NUM_THREADS : [not set]
    MKL_NUM_THREADS : [not set]
ATen parallel backend: OpenMP
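The report above is easier to compare across machines once it is reduced to a few fields. The following is a minimal sketch, assuming the report keeps the "name() : value" layout shown above; the function name parse_parallel_info and the SAMPLE string are made up for illustration and are not part of PyTorch.

```python
import re


def parse_parallel_info(report: str) -> dict:
    """Extract thread counts and OpenMP status from a parallel_info() report."""
    info = {"openmp_found": False}
    for raw in report.splitlines():
        line = raw.strip()
        # The report prints either "OpenMP not found" or an "OpenMP <version>" line.
        if line.startswith("OpenMP"):
            info["openmp_found"] = line != "OpenMP not found"
            continue
        # Lines such as "at::get_num_threads() : 1" hold "name() : integer" pairs.
        match = re.match(r"^(\S+\(\))\s*:\s*(\d+)$", line)
        if match:
            info[match.group(1)] = int(match.group(2))
    return info


# Shortened copy of the output posted in the question.
SAMPLE = """ATen/Parallel:
    at::get_num_threads() : 1
    at::get_num_interop_threads() : 4
OpenMP not found
    mkl_get_max_threads() : 4
std::thread::hardware_concurrency() : 8
ATen parallel backend: OpenMP
"""

info = parse_parallel_info(SAMPLE)
print(info["openmp_found"], info["at::get_num_threads()"])
```

With the question's output, the parsed result shows OpenMP missing and a single intra-op thread, even though the last line of the report names OpenMP as the parallel backend.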

Update 2:

It turns out that the Clang shipped with XCode does not support OpenMP. The gcc I was using is just a symlink to Clang. In fact, after running gcc --version, I got:

Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/c++/4.2.1
Apple clang version 12.0.0 (clang-1200.0.32.29)
Target: x86_64-apple-darwin20.3.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

I installed gcc-10 from Homebrew and set an alias like this: alias gcc='gcc-10'. Now gcc --version gives me:

gcc-10 (Homebrew GCC 10.2.0_4) 10.2.0
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
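A shell alias is expanded only by the interactive shell, so a program that launches gcc itself still resolves the name through PATH. The sketch below checks what gcc resolves to from Python and whether it identifies itself as Apple clang. The helper names is_apple_clang and compiler_report are hypothetical, and whether this explains the unchanged result in this thread is an assumption, not something the thread confirms.

```python
import shutil
import subprocess


def is_apple_clang(version_text: str) -> bool:
    """Return True if `--version` output identifies the compiler as Apple clang."""
    return "Apple clang" in version_text


def compiler_report(name: str = "gcc"):
    """Resolve `name` on PATH (shell aliases are not visible here) and classify it."""
    path = shutil.which(name)
    if path is None:
        return None
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    # Some compilers print part of the banner to stderr, so both streams are combined.
    text = result.stdout + result.stderr
    return {"path": path, "apple_clang": is_apple_clang(text)}


if __name__ == "__main__":
    print(compiler_report("gcc"))
```

If the report still shows Apple clang after the alias is set, a build started from a script or from cmake would most likely pick up the same compiler.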

I also tried a simple OpenMP Hello World with 8 threads, and everything seems to work. But after re-running the command:

python3 -c 'import torch; print(torch.__config__.parallel_info())'

I get the same result. Any ideas?

Answer 1

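AVX, AVX2, and FMA are CPU instruction sets and have nothing to do with multithreading. If the pip package for pytorch/caffe2 used these instructions on a CPU that did not support them, the software would not work. Pytorch installed via pip does come with multithreading enabled, though. You can confirm this with torch.__config__.parallel_info():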

❯ python3 -c 'import torch; print(torch.__config__.parallel_info())'
ATen/Parallel:
    at::get_num_threads() : 6
    at::get_num_interop_threads() : 6
OpenMP 201107 (a.k.a. OpenMP 3.1)
    omp_get_max_threads() : 6
Intel(R) Math Kernel Library Version 2020.0.1 Product Build 20200208 for Intel(R) 64 architecture applications
    mkl_get_max_threads() : 6
Intel(R) MKL-DNN v1.6.0 (Git Hash 5ef631a030a6f73131c77892041042805a06064f)
std::thread::hardware_concurrency() : 12
Environment variables:
    OMP_NUM_THREADS : [not set]
    MKL_NUM_THREADS : [not set]
ATen parallel backend: OpenMP

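That said, if you still want to go ahead and build pytorch and caffe2 from source, the flag you are looking for, USE_NATIVE_ARCH, is in pytorch/CMakeLists.txt, one level up from caffe2. Edit that file and change USE_NATIVE_ARCH to ON. Then continue building pytorch with python3 setup.py build. Note that the USE_NATIVE_ARCH flag does not do what you think it does. It only allows MKL-DNN to be built with CPU-native optimization flags. It does not trickle down to caffe2 (except, obviously, where caffe2 uses MKL-DNN).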
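The edit to pytorch/CMakeLists.txt can be scripted instead of done by hand. The sketch below flips a CMake option from OFF to ON in a piece of text. It assumes the option is declared in the common form option(NAME "description" OFF); the helper set_cmake_option and the SAMPLE line are illustrative and not copied from the real file.

```python
import re


def set_cmake_option(cmake_text: str, name: str, value: str = "ON") -> str:
    """Rewrite the default of `option(<name> "<doc>" ON|OFF)` to `value`."""
    pattern = re.compile(
        r'(option\(\s*' + re.escape(name) + r'\s+"[^"]*"\s+)(ON|OFF)(\s*\))'
    )
    new_text, count = pattern.subn(
        lambda m: m.group(1) + value + m.group(3), cmake_text
    )
    if count == 0:
        # Fail loudly rather than leave the file silently unchanged.
        raise ValueError(f"option {name} not found")
    return new_text


# Illustrative stand-in for the relevant line of a CMakeLists.txt.
SAMPLE = 'option(USE_NATIVE_ARCH "Use -march=native" OFF)\n'

print(set_cmake_option(SAMPLE, "USE_NATIVE_ARCH"), end="")
```

To apply it to a checkout, read the file, pass its text through set_cmake_option, and write the result back before running the build.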

Answer 1
3 Accepted 2021-02-25 08:48:21
