OProfile with OpenMP

I am using OProfile on OpenMP-parallelized code by doing the following:

$ gcc -I/usr/include/hdf5/serial/ -std=c11 -O3 -fopt-info -fopenmp sp_linsvm.c -o sp_linsvm -lhdf5_serial
$ sudo ocount --events=CPU_CLK_UNHALTED,LLC_MISSES,LLC_REFS,MEM_INST_RETIRED,BR_MISP_EXEC ./sp_linsvm
Events were actively counted for 22.0 seconds.
Event counts (scaled) for /home/aidan/progs/linsvm/sp_linsvm:
    Event                    Count                    % time counted
    BR_MISP_EXEC             6,523,181                80.00
    CPU_CLK_UNHALTED         225,384,009,348          80.00
    LLC_MISSES               276,587,407              80.02
    LLC_REFS                 1,098,236,806            80.00
    MEM_INST_RETIRED         51,754,855,734           79.99

How do I know whether the events are counted per CPU or as a whole? I am pretty sure it's as a whole, since the counts are close to the numbers I get when I compile without OpenMP, but I want to be sure.

The default mode for ocount ... ./program is "command". As I understand it, without the -t (--separate-thread) or -c (--separate-cpu) options, data from all threads is aggregated.

So check the documentation at http://oprofile.sourceforge.net/doc/controlling-counter.html#controlling-ocount and try the -t / -c options. The relevant entries read:

--separate-thread / -t: This option can be used in conjunction with either the --process-list or --thread-list option to display event counts on a per-thread (per-process) basis. Without this option, all counts are aggregated.

--separate-cpu / -c: This option can be used in conjunction with either the --system-wide or --cpu-list option to display event counts on a per-cpu basis. Without this option, all counts are aggregated.
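A rough sanity check on the numbers in the question points the same way. This is my own arithmetic, not something ocount reports: divide the CPU_CLK_UNHALTED count by the time the events were counted. The variable names below are made up for illustration, and the counts are scaled estimates (about 80% time counted), so treat the result as approximate.

```shell
# Values copied from the ocount output in the question.
cycles=225384009348   # CPU_CLK_UNHALTED (scaled count)
seconds=22            # "Events were actively counted for 22.0 seconds"

# Integer division is enough for an order-of-magnitude check.
cycles_per_second=$(( cycles / seconds ))
echo "unhalted cycles per second: $cycles_per_second"
```

That works out to roughly 10.2 billion unhalted cycles per second. Assuming no single core on the machine runs anywhere near 10 GHz, a figure that large cannot come from one CPU, which is consistent with the counts being summed over all OpenMP threads.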
