Use rdtsc function in Assembly
I am trying to profile an x86 assembly program on Ubuntu 12.04 and would like to use the rdtsc instruction. According to a comment, I should end up with the number of cycles in rdx, but with the following code I get a number that is far too high:
SECTION .bss
SECTION .data
SECTION .text
global main
main:
nop
cpuid
rdtsc
shl rdx, 32
or rdx, rax
mov r8, rdx
xor esi,esi
mov esi,19 ; instructions to be monitored
cpuid
rdtsc
shl rdx, 32
or rdx, rax
sub rdx, r8
Running it in a debugger, I get the following register values after the sub instruction:
rax 0xd88102bc
rbx 0x0
rcx 0xf0
rdx 0x44f3914a0
rsi 0x13
rdi 0x1
rbp 0x0
rsp 0x7fffffffdf38
r8 0x11828947ee1c
I can't figure out why the number of cycles in rdx is so high for such simple instructions. Is the right number in rcx? Isn't that too high as well?
Thanks in advance.
I'm not sure what's happening, but when you call C functions from assembler you often need to prefix them with a leading underscore, for example call _clock. This is because on some platforms (macOS and 32-bit Windows, for instance, though not Linux ELF targets) the C compiler adds this prefix to every function it generates.
Additionally, as you're on a 64-bit architecture, the 64-bit result should end up in rax; make sure you're looking at that, not at eax and ebx.
Finally, rather than using clock, I'd suggest you use the assembler instruction rdtsc. This returns a 64-bit result in edx:eax, with the high 32 bits in edx and the low 32 bits in eax. It's relative rather than absolute and is measured in cycles rather than fractions of a second, but it should be exactly what you need for profiling.
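To make the edx:eax split concrete, here is a small C sketch of the arithmetic that shl rdx, 32 followed by or rdx, rax performs. The helper names combine_tsc and elapsed_ticks are made up for illustration, and this is only a model of the register arithmetic, not a way of reading the counter.

```c
#include <stdint.h>

/* rdtsc puts the low 32 bits of the counter in eax and the high 32 bits
   in edx. This mirrors "shl rdx, 32" followed by "or rdx, rax". */
static uint64_t combine_tsc(uint32_t edx, uint32_t eax)
{
    return ((uint64_t)edx << 32) | eax;
}

/* Ticks between two readings. Unsigned subtraction stays correct even if
   the counter wrapped around once between the two reads. */
static uint64_t elapsed_ticks(uint64_t start, uint64_t end)
{
    return end - start;
}
```

With the values from the question's register dump, combine_tsc(0x1182, 0x8947ee1c) gives 0x11828947ee1c (the value saved in r8), and the second reading, 0x1186d88102bc (r8 plus the reported difference), minus that is 0x44f3914a0, roughly 18.5 billion ticks. At GHz-range tick rates that is several seconds, which would be consistent with time spent stopped in the debugger between the two reads rather than with the cost of the measured instructions. That is one plausible reading, not something the dump proves.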
Example:
cpuid
rdtsc
shl rdx, 32
or rdx, rax
mov r8, rdx
<expensive assembler code>
cpuid
rdtsc
shl rdx, 32
or rdx, rax
sub rdx, r8
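This will leave the number of ticks that elapsed in rdx. The cpuid instructions are there to prevent the processor from reordering instructions around the profiling points. Note that cpuid takes its input in eax and overwrites eax, ebx, ecx and edx, so save anything you still need from those registers before each measurement point.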
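If the code being profiled is called from C, the same cpuid / rdtsc pattern can be wrapped in a helper. The following is a minimal sketch, assuming GCC or Clang on x86-64 (it uses GNU inline assembly in AT&T syntax, so the operand order is reversed relative to the NASM code above). The names read_tsc_serialized, ticks_for and spin are hypothetical, and the result includes the overhead of the measurement itself.

```c
#include <stdint.h>

/* Assumes GCC or Clang on x86-64. cpuid serves as the serializing barrier;
   because it overwrites eax, ebx, ecx and edx, those registers are declared
   as outputs or clobbers so the compiler does not keep live values there. */
static inline uint64_t read_tsc_serialized(void)
{
    uint32_t lo, hi;
    __asm__ __volatile__(
        "xorl %%eax, %%eax\n\t"   /* select cpuid leaf 0 */
        "cpuid\n\t"               /* let earlier instructions complete */
        "rdtsc"                   /* edx:eax = time-stamp counter */
        : "=a"(lo), "=d"(hi)
        :
        : "rbx", "rcx", "memory", "cc");
    return ((uint64_t)hi << 32) | lo;
}

/* Hypothetical workload standing in for <expensive assembler code>. */
static void spin(void)
{
    volatile uint32_t x = 0;
    for (int i = 0; i < 1000; i++)
        x += (uint32_t)i;
}

/* Hypothetical helper: ticks spent in fn(), measurement overhead included. */
static uint64_t ticks_for(void (*fn)(void))
{
    uint64_t start = read_tsc_serialized();
    fn();
    uint64_t end = read_tsc_serialized();
    return end - start;
}
```

Calling ticks_for(spin) returns the tick difference for one run. Single-stepping through the measured region in a debugger will inflate that number, so it is best read from a normal, uninterrupted run.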