
How can I count instructions executed on Red Hat Enterprise Linux (x86-64)?

I want to find out how many x86-64 instructions are executed during a given run of a program on Red Hat Enterprise Linux. I know I can get this information from valgrind, but the slowdown is considerable. I also know that we are using Intel Core 2 Quad CPUs (model Q6700), which have hardware performance counters built in. But I don't know of any way to get access to the total number of instructions executed from within a C program.

libpapi is the library you are looking for. Both AMD and Intel chips provide instruction counts.

Performance Application Programming Interface (PAPI) appears to be along the lines of what you are looking for.

From the website:

PAPI aims to provide the tool designer and application engineer with a consistent interface and methodology for use of the performance counter hardware found in most major microprocessors.

Vince Weaver, a Post Doctoral Research Associate with the Innovative Computing Laboratory at the University of Tennessee, did some PAPI-related work. The research listed on his web page at UTK looks like it may provide some additional information.

There are a couple of ways you could go about it, depending on exactly what you need. If you just want to find out the total number of instructions in the program (a static count), you could run objdump on the binary, which will give you the assembly. If you want more detailed information about the instructions actually executed on a given run of the program, you may want to look into DynamoRIO, which provides that functionality. It is similar to valgrind, but I believe it has a smaller effect on performance. I was able to throw together a basic instruction counter with it back in September relatively quickly and easily.

If that's no good, you could try checking out PAPI, which is an API that should let you get at the performance counters on your processors. I've never used it, so I can't speak for it, but a friend of mine used it in a project about 6 months ago and said he found it very helpful.
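To make the idea of reading a hardware instruction counter from C concrete, here is a minimal sketch. Note that it does not use PAPI: it swaps in the Linux kernel's perf_event_open system call, which exposes the same kind of hardware counter without an external library, and it assumes a kernel and headers that provide that call. The helper names open_instruction_counter, count_instructions, and sum_to_n are hypothetical, chosen only for illustration.

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: open a counter of retired user-space instructions
   for the calling thread. Returns a file descriptor, or -1 on failure. */
static int open_instruction_counter(void)
{
    struct perf_event_attr attr;
    memset(&attr, 0, sizeof(attr));
    attr.type = PERF_TYPE_HARDWARE;
    attr.size = sizeof(attr);
    attr.config = PERF_COUNT_HW_INSTRUCTIONS;
    attr.disabled = 1;        /* start stopped; enabled explicitly later */
    attr.exclude_kernel = 1;  /* count user-space instructions only */
    attr.exclude_hv = 1;      /* do not count hypervisor instructions */

    /* pid = 0, cpu = -1: this thread, on whichever CPU it runs. */
    return (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0UL);
}

/* Hypothetical helper: run work(arg) and return the number of instructions
   counted while it ran, or -1 if the counter could not be used. */
static long long count_instructions(void (*work)(void *), void *arg)
{
    int fd = open_instruction_counter();
    if (fd < 0)
        return -1;

    long long result = -1;
    uint64_t count = 0;
    if (ioctl(fd, PERF_EVENT_IOC_RESET, 0) == 0 &&
        ioctl(fd, PERF_EVENT_IOC_ENABLE, 0) == 0) {
        work(arg);
        ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);
        if (read(fd, &count, sizeof(count)) == (ssize_t)sizeof(count))
            result = (long long)count;
    }
    close(fd);
    return result;
}

/* Example workload (hypothetical): sum the integers 0..n-1. */
static void sum_to_n(void *arg)
{
    volatile unsigned long total = 0;  /* volatile keeps the loop in place */
    unsigned long n = *(unsigned long *)arg;
    for (unsigned long i = 0; i < n; i++)
        total += i;
}
```

Calling count_instructions(sum_to_n, &n) with an unsigned long n returns the count for that one call. The open can fail, in which case the sketch returns -1: for example when /proc/sys/kernel/perf_event_paranoid is restrictive, or inside a container or virtual machine that does not expose the counters.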

The program below reads the cycle counter register from C (sorry, the code is not portable, but it works fine with gcc). It counts cycles, which is not the same thing as instructions. Modern processors can spend several cycles on a single instruction or execute several instructions at once. Cycles are usually more interesting than the number of instructions, but it depends on your actual purpose.

Other performance counters can certainly be accessed the same way (actually, I don't even know whether there are others), but I would have to look up the actual instruction code to use.

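static __inline__ unsigned long long rdtsc(void)
{
   unsigned int lo, hi;
   /* 0x0f, 0x31 is the opcode for rdtsc, which returns the counter in EDX:EAX.
      The "=A" constraint means the EDX:EAX pair only on 32-bit x86, so on
      x86-64 the two halves are read separately and combined. */
   __asm__ volatile (".byte 0x0f, 0x31" : "=a" (lo), "=d" (hi));
   return ((unsigned long long)hi << 32) | lo;
}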
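As a usage illustration for a counter read like the one above, here is a minimal sketch that measures how many time-stamp counter ticks elapse around a region of code. It assumes GCC or Clang on x86 or x86-64. The names read_tsc and ticks_for_loop are hypothetical, and the sketch reads EAX and EDX separately so that it behaves the same on 32-bit and 64-bit targets.

```c
#include <stdint.h>

/* Hypothetical helper: read the time-stamp counter (x86 / x86-64 only). */
static inline uint64_t read_tsc(void)
{
    uint32_t lo, hi;
    /* rdtsc places the low 32 bits in EAX and the high 32 bits in EDX. */
    __asm__ volatile ("rdtsc" : "=a" (lo), "=d" (hi));
    return ((uint64_t)hi << 32) | lo;
}

/* Hypothetical helper: counter ticks elapsed while summing 0..n-1. */
static uint64_t ticks_for_loop(unsigned long n)
{
    volatile unsigned long total = 0;  /* volatile keeps the loop in place */
    uint64_t start = read_tsc();
    for (unsigned long i = 0; i < n; i++)
        total += i;
    uint64_t end = read_tsc();
    return end - start;
}
```

The result is a tick count, not an instruction count. Because rdtsc is not a serializing instruction, the processor may reorder it relative to neighboring instructions, so measurements of very short regions are only approximate.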
