Issue with unsigned long long int and SPARC assembly code on sparc64

Question

I have an issue with the C code below, into which I have included SPARC assembly. The code is compiled and run on Debian 9.0 sparc64. It does a simple summation and prints the result, which should equal nLoop.

The problem is that for an initial number of iterations greater than 1e+9, the final sum is systematically equal to 1410065408. I don't understand why, since I explicitly declared sum as unsigned long long int, so sum can be anywhere in the range [0, 18,446,744,073,709,551,615].

For example, for nLoop = 1e+9, I expect sum to be equal to 1e+9.

Does the issue come from the included SPARC assembly code, which might not handle 64-bit variables (as input or output)?

#include <stdio.h>
#include <stdlib.h>

int main (int argc, char *argv[])
{
  int i;
  // Init sum
  unsigned long long int sum = 0ULL;
  // Number of iterations
  unsigned long long int nLoop = 10000000000ULL;

   // Loop with Sparc assembly into C source
   asm volatile ("clr %%g1\n\t"
                 "clr %%g2\n\t"
                 "mov %1, %%g1\n" // %1 = input parameter
                 "loop:\n\t"
                 "add %%g2, 1, %%g2\n\t"
                 "subcc %%g1, 1, %%g1\n\t"
                 "bne loop\n\t"
                 "nop\n\t"
                 "mov %%g2, %0\n" // %0 = output parameter
                 : "=r" (sum)     // output
                 : "r" (nLoop)    // input
                 : "g1", "g2");   // clobbers

  // Print results
  printf("Sum = %llu\n", sum);

  return 0;

}

How can I fix this range problem and use 64-bit variables in SPARC assembly code?

PS: I tried compiling with gcc -m64; the issue remains.

Update 1

As requested by @zwol, below is the SPARC assembly output generated with: gcc -O2 -m64 -S loop.c -o loop.s

        .file   "loop.c"
        .section        ".text"
        .section        .rodata.str1.8,"aMS",@progbits,1
        .align 8
.LC0:
        .asciz  "Sum = %llu\n"
        .section        .text.startup,"ax",@progbits
        .align 4
        .global main
        .type   main, #function
        .proc   04
main:
        .register       %g2, #scratch
        save    %sp, -176, %sp
        sethi   %hi(_GLOBAL_OFFSET_TABLE_-4), %l7
        call    __sparc_get_pc_thunk.l7
         add    %l7, %lo(_GLOBAL_OFFSET_TABLE_+4), %l7
        sethi   %hi(9764864), %o1
        or      %o1, 761, %o1
        sllx    %o1, 10, %o1
#APP
! 13 "loop.c" 1
        clr %g1
        clr %g2
        mov %o1, %g1
loop:
        add %g2, 1, %g2
        subcc %g1, 1, %g1
        bne loop
        nop
        mov %g2, %o1

! 0 "" 2
#NO_APP
        mov     0, %i0
        sethi   %gdop_hix22(.LC0), %o0
        xor     %o0, %gdop_lox10(.LC0), %o0
        call    printf, 0
         ldx    [%l7 + %o0], %o0, %gdop(.LC0)
        return  %i7+8
         nop
        .size   main, .-main
        .ident  "GCC: (Debian 7.3.0-15) 7.3.0"
        .section        .text.__sparc_get_pc_thunk.l7,"axG",@progbits,__sparc_get_pc_thunk.l7,comdat
        .align 4
        .weak   __sparc_get_pc_thunk.l7
        .hidden __sparc_get_pc_thunk.l7
        .type   __sparc_get_pc_thunk.l7, #function
        .proc   020
__sparc_get_pc_thunk.l7:
        jmp     %o7+8
         add    %o7, %l7, %l7
        .section        .note.GNU-stack,"",@progbits

Update 2

As suggested by @Martin Rosenau, I made the following modifications:

loop:
        add %g2, 1, %g2
        subcc %g1, 1, %g1
        bpne %icc, loop
        bpne %xcc, loop
        nop
        mov %g2, %o1

But at compilation, I get:

Error: Unknown opcode: `bpne'

What could be the reason for this compilation error?

Answer 1

subcc %%g1, 1, %%g1
bne loop

Your problem is the bne instruction.

Unlike x86-64 CPUs, Sparc64 CPUs don't have different instructions for 32-bit and 64-bit subtraction:

If you subtract 1 from 0x12345678, the result is 0x12345677. If you subtract 1 from 0xF00D 12345678, the result is 0xF00D 12345677. So if you only use the lower 32 bits of a register, a 64-bit subtraction has the same effect as a 32-bit subtraction.

Therefore, Sparc64 CPUs do not have different instructions for 64-bit and 32-bit addition, subtraction, multiplication, left shift, etc.

These CPUs do have different instructions for 32-bit and 64-bit operations when the upper 32 bits influence the lower 32 bits (e.g. right shift).

However, the zero flag depends on the result of the subcc operation, and a result can be zero in its lower 32 bits without being zero in all 64 bits.

To solve this problem, Sparc64 CPUs have each of the integer flags (zero, overflow, carry, sign) twice:

The 32-bit zero flag is set if the lower 32 bits of the result are zero; the 64-bit zero flag is set if all 64 bits of the result are zero.

To be compatible with existing 32-bit programs, the bne instruction checks the 32-bit zero flag, not the 64-bit zero flag.

is systematically equal to 1410065408

1e10 = 0x200000000 + 1410065408, so after 1410065408 steps the value 0x200000000 is reached, which has its lower 32 bits set to 0, and bne will not jump any more.

However, for 1e11 you should not get 1410065408 but 1215752192, because 1e11 = 0x1700000000 + 1215752192.

bne

There is a new instruction named bpne which takes up to 4 arguments!

In the simplest variant (with only two arguments), the instruction should work like this (I have not used Sparc for 5 years now, so I'm not sure):

bpne %icc, loop   # Like bne (based on the 32-bit result)
bpne %xcc, loop   # Like bne, but based on the 64-bit result

EDIT

Error: Unknown opcode: 'bpne'

I just tried using the GNU assembler:

The GNU assembler names the new instruction bne, just like the old one:

bne loop         # Old variant
bne %icc, loop   # New variant based on the 32-bit result
bne %xcc, loop   # (New variant) Based on the 64-bit result

subcc %g1, 1, %g1
bpne %icc, loop
bpne %xcc, loop
nop

The first bpne (or bne) makes no sense: whenever the first branch would jump, the second would jump as well. And if you don't use .reorder (however, this is the default), you would also need to add a nop between the two branch instructions...

The code should look like this (assuming your assembler also uses the name bne for bpne):

   subcc %g1, 1, %g1
   bne %xcc, loop
   nop

Answer 2

Try "bne %xcc, loop"; it should branch based on the 64-bit result.
