f2py - automatic multithreading?
I am currently working on some Python code, and to gain speed I used f2py to wrap some existing Fortran code. Everything works well and the speedup is amazing.
However, I found that the code now seems to run on multiple threads (according to htop), which is something I did not specify anywhere (maybe f2py does this internally?).
Here's the command I use to create the module:
f2py --f90exec="gfortran" --f90flags="" --noopt \
$(ACMLLIB) $(FFTLIB) $(ACMLINC) $(FFTINC) -c -m fmod myCode.f90
where the variables $(ACMLLIB), $(FFTLIB), $(ACMLINC) and $(FFTINC) are paths to the libraries.
When I run the script, it appears to take all the cores it can find. I don't mind that it does this, but I want to at least be able to control it. How can I do that, e.g. by setting the number of threads?
I suspect this has something to do with the -pthread option here:
....
....
compiling C sources
C compiler: x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fno-strict-aliasing -Wdate-time -D_FORTIFY_SOURCE=2 -g -fstack-protector-strong -Wformat -Werror=format-security -fPIC
....
....
This is a piece of the massive output I get when compiling the Fortran module. I have no idea how to handle this.
ACML, the (now end-of-life) math library by AMD, can use multiple cores; see http://developer.amd.com/tools-and-sdks/archive/compute/amd-core-math-library-acml/acml-product-features/
This is most probably what you are seeing. There is a copy of the docs here: https://engineering.ucsb.edu/~stefan/acml.pdf, which mentions the environment variable OMP_NUM_THREADS for controlling the number of cores/threads to use. That is the standard OpenMP environment variable.
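As a rough sketch of how that variable could be set from the Python side rather than from the shell: OpenMP runtimes typically read OMP_NUM_THREADS when they initialize, so the assumption here is that it must be in the environment before the f2py-built module (fmod in the question) is imported. The helper name limit_openmp_threads is made up for illustration, and whether ACML honors the value depends on which ACML build is linked.

```python
import os


def limit_openmp_threads(n):
    """Request at most n OpenMP threads for libraries loaded later in this process.

    Hypothetical helper: it only sets the standard OMP_NUM_THREADS variable.
    """
    if n < 1:
        raise ValueError("thread count must be at least 1")
    os.environ["OMP_NUM_THREADS"] = str(n)
    return os.environ["OMP_NUM_THREADS"]


# Set the limit first ...
limit_openmp_threads(2)

# ... and only then import the extension module built by f2py.
# (Left commented out because fmod exists only on the asker's machine.)
# import fmod
```

Exporting the variable in the shell before starting Python (for example, prefixing the command with OMP_NUM_THREADS=2) should have the same effect and avoids any dependence on import order.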
It would be nice to be able to set the number of f2py threads via an environment variable or something similar. I searched around a bit, but could not find any info about doing that.
However, if you're running on Linux, say, you can use the taskset command-line utility, which provides a way to pin your process (any process) to a particular CPU core or set of CPU cores. This is a bit crude, but I think it will accomplish what you need.
For more info, look here, for instance: http://xmodulo.com/run-program-process-specific-cpu-cores-linux.html
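On the command line, pinning would look something like taskset -c 0,1 python script.py, where script.py is a placeholder name. A similar effect can be sketched from inside Python with os.sched_setaffinity, which is available on Linux only. The helper pin_to_cores below is hypothetical, and it assumes it is called before the Fortran module is imported, so that any threads created later inherit the restricted CPU set.

```python
import os


def pin_to_cores(cores):
    """Restrict the calling process to the given CPU cores (Linux only).

    Hypothetical helper: returns the resulting affinity set, or None on
    platforms that do not provide os.sched_setaffinity.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    # Only cores the process is currently allowed to use can be selected.
    allowed = os.sched_getaffinity(0)
    target = set(cores) & allowed
    if not target:
        raise ValueError("none of the requested cores are available")
    # pid 0 means "the calling process"; threads started afterwards inherit this mask.
    os.sched_setaffinity(0, target)
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    # Example: pin to cores 0 and 1, then import the f2py module afterwards.
    print(pin_to_cores([0, 1]))
```

Note that this limits which cores the threads may run on, not how many threads the library creates, so it is as crude as taskset itself.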