Issue with unsigned long long int with SPARC assembly code on sparc64
I have an issue with the C code below, into which I have embedded SPARC assembly. The code is compiled and run on Debian 9.0 Sparc64. It performs a simple summation and prints the result, which should equal nLoop.
The problem is that for an initial number of iterations greater than 1e+9, the final sum is systematically equal to 1410065408. I don't understand why, since I explicitly declared sum as unsigned long long int, so sum can hold any value in the range [0, 18,446,744,073,709,551,615].
For example, for nLoop = 1e+9, I expect sum to be equal to 1e+9.
Does the issue come instead from the embedded SPARC assembly code, which might not handle 64-bit variables (as input or output)?
#include <stdio.h>
#include <stdlib.h>

int main (int argc, char *argv[])
{
    int i;
    // Init sum
    unsigned long long int sum = 0ULL;
    // Number of iterations
    unsigned long long int nLoop = 10000000000ULL;

    // Loop with Sparc assembly into C source
    asm volatile ("clr %%g1\n\t"
                  "clr %%g2\n\t"
                  "mov %1, %%g1\n"    // %1 = input parameter
                  "loop:\n\t"
                  "add %%g2, 1, %%g2\n\t"
                  "subcc %%g1, 1, %%g1\n\t"
                  "bne loop\n\t"
                  "nop\n\t"
                  "mov %%g2, %0\n"    // %0 = output parameter
                  : "=r" (sum)        // output
                  : "r" (nLoop)       // input
                  : "g1", "g2");      // clobbers

    // Print results
    printf("Sum = %llu\n", sum);
    return 0;
}
How can I fix this range problem and use 64-bit variables in SPARC assembly code?
PS: I tried compiling with gcc -m64; the issue remains.
As requested by @zwol, below is the SPARC assembly output generated with:
gcc -O2 -m64 -S loop.c -o loop.s
.file "loop.c"
.section ".text"
.section .rodata.str1.8,"aMS",@progbits,1
.align 8
.LC0:
.asciz "Sum = %llu\n"
.section .text.startup,"ax",@progbits
.align 4
.global main
.type main, #function
.proc 04
main:
.register %g2, #scratch
save %sp, -176, %sp
sethi %hi(_GLOBAL_OFFSET_TABLE_-4), %l7
call __sparc_get_pc_thunk.l7
add %l7, %lo(_GLOBAL_OFFSET_TABLE_+4), %l7
sethi %hi(9764864), %o1
or %o1, 761, %o1
sllx %o1, 10, %o1
#APP
! 13 "loop.c" 1
clr %g1
clr %g2
mov %o1, %g1
loop:
add %g2, 1, %g2
subcc %g1, 1, %g1
bne loop
nop
mov %g2, %o1
! 0 "" 2
#NO_APP
mov 0, %i0
sethi %gdop_hix22(.LC0), %o0
xor %o0, %gdop_lox10(.LC0), %o0
call printf, 0
ldx [%l7 + %o0], %o0, %gdop(.LC0)
return %i7+8
nop
.size main, .-main
.ident "GCC: (Debian 7.3.0-15) 7.3.0"
.section .text.__sparc_get_pc_thunk.l7,"axG",@progbits,__sparc_get_pc_thunk.l7,comdat
.align 4
.weak __sparc_get_pc_thunk.l7
.hidden __sparc_get_pc_thunk.l7
.type __sparc_get_pc_thunk.l7, #function
.proc 020
__sparc_get_pc_thunk.l7:
jmp %o7+8
add %o7, %l7, %l7
.section .note.GNU-stack,"",@progbits
UPDATE 2:
As suggested by @Martin Rosenau, I made the following modifications:
loop:
add %g2, 1, %g2
subcc %g1, 1, %g1
bpne %icc, loop
bpne %xcc, loop
nop
mov %g2, %o1
But at compile time, I get:
Error: Unknown opcode: `bpne'
What could be the reason for this compilation error?
subcc %%g1, 1, %%g1
bne loop
Your problem is the bne instruction:
Unlike x86-64 CPUs, Sparc64 CPUs don't have different instructions for 32-bit and 64-bit subtraction:
If you subtract 1 from 0x12345678, the result is 0x12345677. If you subtract 1 from 0xF00D 12345678, the result is 0xF00D 12345677. So if you only use the lower 32 bits of a register, a 64-bit subtraction has the same effect as a 32-bit subtraction.
Therefore Sparc64 CPUs do not have different instructions for 64-bit and 32-bit addition, subtraction, multiplication, left shift, etc.
These CPUs do have different instructions for 32-bit and 64-bit operations when the upper 32 bits influence the lower 32 bits (e.g. right shift).
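To make that distinction concrete, here is a small C sketch (an illustration only, not SPARC code). The helper names are hypothetical, and uint32_t / uint64_t merely stand in for the lower half and the full width of a 64-bit register. It suggests why subtraction needs only one instruction while right shift needs two: the low 32 bits of a subtraction do not depend on the upper half, whereas those of a right shift do.

```c
#include <stdint.h>

/* Low 32 bits of a full 64-bit subtraction. */
static uint32_t sub_low32_of_64(uint64_t a, uint64_t b)
{
    return (uint32_t)(a - b);
}

/* Subtraction performed on the low 32 bits only. */
static uint32_t sub_32(uint64_t a, uint64_t b)
{
    return (uint32_t)a - (uint32_t)b;
}

/* Low 32 bits of a full 64-bit logical right shift:
   bits from the upper half are shifted into the lower half. */
static uint32_t shr_low32_of_64(uint64_t a, unsigned s)
{
    return (uint32_t)(a >> s);
}

/* Logical right shift performed on the low 32 bits only:
   zeros are shifted in from the top. */
static uint32_t shr_32(uint64_t a, unsigned s)
{
    return (uint32_t)a >> s;
}
```

With a = 0xF00D12345678, both subtraction helpers give 0x12345677 for a - 1, but shifting right by 4 gives 0xD1234567 in the 64-bit variant and 0x01234567 in the 32-bit one.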
However, the zero flag depends on the result of the subcc operation.
To solve this problem, Sparc64 CPUs have each of the integer flags (zero, overflow, carry, sign) twice:
The 32-bit zero flag is set if the lower 32 bits of the result are zero; the 64-bit zero flag is set if all 64 bits of the result are zero.
To be compatible with existing 32-bit programs, the bne instruction checks the 32-bit zero flag, not the 64-bit zero flag.
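As a rough model of the two zero flags (a sketch in plain C, with hypothetical helper names, not an exact description of the hardware), each flag can be thought of as a predicate on the 64-bit result of subcc:

```c
#include <stdbool.h>
#include <stdint.h>

/* Models the 32-bit zero flag (%icc): set when the lower
   32 bits of the result are zero. */
static bool icc_zero(uint64_t result)
{
    return (uint32_t)result == 0;
}

/* Models the 64-bit zero flag (%xcc): set when all 64 bits
   of the result are zero. */
static bool xcc_zero(uint64_t result)
{
    return result == 0;
}
```

For a result of 0x200000000, icc_zero is true while xcc_zero is false, so a branch that tests only the 32-bit flag treats that value as if it were zero.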
is systematically equal to 1410065408
1e10 = 0x200000000 + 1410065408, so after 1410065408 steps the value 0x200000000 is reached, which has its lower 32 bits set to 0, and bne no longer jumps.
However, for 1e11 you should get 1215752192 rather than 1410065408, because 1e11 = 0x1700000000 + 1215752192.
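The arithmetic can be checked without running the loop. The following C helper (a hypothetical name, written only for illustration) computes how many passes the add/subcc/bne loop makes when the branch looks at the 32-bit zero flag only:

```c
#include <stdint.h>

/* Number of passes through the loop when the exit test only
   sees the lower 32 bits of the down-counter. */
static uint64_t passes_with_32bit_bne(uint64_t n_loop)
{
    uint64_t low = n_loop & 0xFFFFFFFFULL;

    /* The counter's low half first reaches zero after `low`
       decrements; if it already starts at zero, the first
       decrement wraps it and a full 2^32 steps are needed. */
    return low != 0 ? low : (1ULL << 32);
}
```

For 10000000000 this gives 1410065408, and for 100000000000 it gives 1215752192, matching the two decompositions above. Any nLoop below 2^32 whose value is nonzero is returned unchanged, which is why small iteration counts appear to work.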
bne
There is a new instruction named bpne which takes up to 4 arguments!
In the simplest variant (with only two arguments), the instruction should work like this (I have not used Sparc for 5 years now, so I'm not sure):
bpne %icc, loop # Like bne (based on the 32-bit result)
bpne %xcc, loop # Like bne, but based on the 64-bit result
EDIT
Error: Unknown opcode: 'bpne'
I just tried this with the GNU assembler:
The GNU assembler names the new instruction bne, just like the old one:
bne loop # Old variant
bne %icc, loop # New variant based on the 32-bit result
bne %xcc, loop # (New variant) Based on the 64-bit result
subcc %g1, 1, %g1
bpne %icc, loop
bpne %xcc, loop
nop
The first bpne (or bne) makes no sense: whenever the first branch would jump, the second would jump as well. And if you don't use .reorder (although this is the default), you would also need to add a nop between the two branch instructions...
The code should look like this (assuming your assembler also uses the name bne for bpne):
subcc %g1, 1, %g1
bne %xcc, loop
nop
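Put back into the C source, the fix might look like the sketch below. It is a sketch under stated assumptions and has not been run on SPARC hardware here: the function name count_down64 is hypothetical, the __arch64__ macro is assumed to identify a 64-bit SPARC build with GCC, and the plain C loop in the #else branch exists only so the file compiles on other targets. Compared with the original, it uses the numeric local label 1: instead of loop: (so the asm can be emitted more than once without duplicate labels), lets the compiler choose the registers via "+r" operands, and declares the condition codes as clobbered.

```c
/* Counts n down to zero and returns the number of passes.
   As in the original loop, n == 0 would wrap around. */
static unsigned long long count_down64(unsigned long long n)
{
    unsigned long long sum = 0ULL;

#if defined(__sparc__) && defined(__arch64__)
    /* Assumption: 64-bit SPARC build, GNU assembler syntax. */
    __asm__ volatile (
        "1:\n\t"
        "add   %0, 1, %0\n\t"      /* sum = sum + 1           */
        "subcc %1, 1, %1\n\t"      /* n = n - 1, sets flags   */
        "bne   %%xcc, 1b\n\t"      /* test the 64-bit Z flag  */
        " nop\n\t"                 /* branch delay slot       */
        : "+r" (sum), "+r" (n)     /* both read and written   */
        :
        : "cc");                   /* condition codes change  */
#else
    /* Portable stand-in with the same behaviour. */
    do {
        sum++;
        n--;
    } while (n != 0);
#endif

    return sum;
}
```

Called as count_down64(10000000000ULL) and printed with %llu, this is expected to give 10000000000 rather than 1410065408, since the branch now depends on all 64 bits of the counter.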
Try "bne %xcc, loop"; it should branch based on the 64-bit result.