A simple while loop in GCC inline assembly

Question

I want to write the following loop using GCC extended inline ASM:

long* arr = new long[ARR_LEN]();
long* act_ptr = arr;
long* end_ptr = arr + ARR_LEN;

while (act_ptr < end_ptr)
{
    *act_ptr = SOME_VALUE;
    act_ptr += STEP_SIZE;
}

delete[] arr;

An array of type long with length ARR_LEN is allocated and zero-initialized. The loop walks through the array with an increment of STEP_SIZE. Every touched element is set to SOME_VALUE.

Well, this was my first attempt in GAS:

long* arr = new long[ARR_LEN]();

asm volatile
(
    "loop:"
    "movl %[sval], (%[aptr]);"
    "leal (%[aptr], %[incr], 4), %[aptr];"
    "cmpl %[eptr], %[aptr];"
    "jl loop;"
    : // no output
    : [aptr] "r" (arr),
      [eptr] "r" (arr + ARR_LEN),
      [incr] "r" (STEP_SIZE),
      [sval] "i" (SOME_VALUE)
    : "cc", "memory"
);

delete[] arr;

As mentioned in the comments, this assembler code is really more of a do {...} while loop, but it does in fact do the same work.
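To make the do/while remark concrete, here is a small sketch in plain C++ with no assembly. The function names `fill_while` and `fill_do_while` are made up for illustration, and the loops use indices rather than the question's pointers. The two forms agree whenever the array has at least one element; the do-while form stores once before it checks anything, so it assumes a non-empty array.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: the loop shape used in the C++ version of the question.
void fill_while(std::vector<long>& a, std::size_t step, long value) {
    assert(step > 0);
    std::size_t i = 0;
    while (i < a.size()) {   // condition is checked before the first store
        a[i] = value;
        i += step;
    }
}

// Hypothetical helper: the loop shape the assembly attempt actually has.
void fill_do_while(std::vector<long>& a, std::size_t step, long value) {
    assert(step > 0);
    assert(!a.empty());      // assumed precondition: the first store is unconditional
    std::size_t i = 0;
    do {
        a[i] = value;
        i += step;
    } while (i < a.size());  // condition is checked only after each store
}
```

For any non-empty array both functions touch exactly the same elements.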

The strange thing about this piece of code is that it worked fine for me at first. But when I later tried to make it work in another project, it seemed to do nothing at all. I even made some 1:1 copies of the working project, compiled again and... still the result is random.

Maybe I chose the wrong constraints for the input operands, but I have tried nearly all of them by now and have run out of ideas. What puzzles me in particular is that it still works in some cases.

I am not an expert at ASM whatsoever, although I learned it when I was still at university. Please note that I am not looking for optimization; I am just trying to understand how inline assembly works. So here is my question: Is there anything fundamentally wrong with my attempt, or did I make a more subtle mistake here? Thanks in advance.

(Working with g++ MinGW Win32 x86 v.4.8.1)

Update

I have already tried out every single suggestion that has been contributed here so far. In particular I tried

using the "q" operand constraint instead of "r": sometimes it works, sometimes it doesn't,
writing ... : [aptr] "=r" (arr) : "0" (arr) ... instead: same result,
or even ... : [aptr] "+r" (arr) : ... , which is still the same.

By now I know the official documentation pretty much by heart, but I still can't see my error.

Answer 1

You are modifying an input operand (aptr), which is not allowed. Either constrain it to match an output operand or change it to an input/output operand.
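A minimal sketch of the second option (an input/output operand), assuming GCC or Clang on x86 or x86-64. The helper name `fill_stride`, the scratch pointer `cur` and the non-x86 fallback are illustrative and not taken from the question. Two further choices here are mine: a numeric local label (`1:` / `1b`) instead of a named one, since a named label can clash if the compiler emits the asm block more than once, and an unsigned branch (`jb`) because the compared values are addresses.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: stores `value` into every `step`-th element of arr[0..n).
void fill_stride(long* arr, std::size_t n, std::size_t step, long value) {
    assert(n > 0 && step > 0);  // the asm loop is do-while shaped: it always stores once
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    long* cur = arr;            // scratch copy that the asm is allowed to change
    long* const end = arr + n;
    asm volatile(
        "1:\n\t"
        "mov %[val], (%[cur])\n\t"                    // *cur = value
        "lea (%[cur], %[inc], %c[size]), %[cur]\n\t"  // cur += step elements
        "cmp %[end], %[cur]\n\t"
        "jb 1b\n\t"                                   // repeat while cur < end (unsigned)
        : [cur] "+&r"(cur)  // read-write; & keeps this register separate from the inputs
        : [end] "r"(end), [inc] "r"(step), [val] "r"(value),
          [size] "i"(sizeof(long))
        : "cc", "memory");
#else
    // Portable fallback so the sketch still builds on other targets.
    for (std::size_t i = 0; i < n; i += step) arr[i] = value;
#endif
}
```

Because only `cur` is handed to the asm as a read-write operand, `arr` still points at the start of the allocation afterwards, so a later `delete[] arr` sees the original pointer.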

Answer 2

Here is complete code that has the intended behavior.

Note that the code is written for a 64-bit machine. Therefore, for example, %%rbx is used instead of %%ebx as the base address for the array. For the same reason, leaq and cmpq should be used instead of leal and cmpl.
movq should be used since the array is of type long.
Type long is 8 bytes, not 4 bytes, on a 64-bit machine that uses the LP64 model (for example Linux); on 64-bit Windows it stays 4 bytes.
jl in the question should be changed to jg, because the cmpq operands in this code are in the opposite order from the question's cmpl.
Register labels cannot be used, since the compiler will replace them with the 32-bit version of the chosen register (e.g., ebx).

Constraint "r" cannot be used. "r" means any register can be used, but not every combination of registers is acceptable for leaq. Look here: x86 addressing modes

#include <iostream>
using namespace std;

int main(){
  int ARR_LEN=20;
  long STEP_SIZE=2;   // 64-bit, so that all of %rcx is defined
  long SOME_VALUE=100;

  long* arr = new long[ARR_LEN];
  int i;
  for (i=0; i<ARR_LEN; i++){
    arr[i] = 0;
  }

  long* cur = arr;    // the loop changes %rbx, so pass a copy as a read-write operand

  __asm__ __volatile__ (
    "loop:"
    "movq %%rdx, (%%rbx);"
    "leaq (%%rbx, %%rcx, 8), %%rbx;"
    "cmpq %%rbx, %%rax;"
    "jg loop;"
    : "+b" (cur)
    : "a" (arr+ARR_LEN), "c" (STEP_SIZE), "d" (SOME_VALUE)
    : "cc", "memory"
  );

  for (i=0; i<ARR_LEN; i++){
    cout << "element " << i << " is " << arr[i] << endl;
  }

  delete[] arr;
  return 0;
}
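The scale factor 8 in the leaq above is the element size in bytes: `lea (base, index, scale), dst` computes `base + index * scale`. A small sketch of that arithmetic in plain C++, with hypothetical helper names `lea_like` and `advance_by`. It assumes a flat address space where converting a pointer to `std::uintptr_t` and back behaves as expected, and that `sizeof(long)` is 4 or 8.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper mirroring `lea (base, index, scale), dst`:
// dst = base + index * scale, with everything measured in bytes.
std::uintptr_t lea_like(std::uintptr_t base, std::uintptr_t index,
                        std::uintptr_t scale) {
    // x86 addressing modes only encode the scale factors 1, 2, 4 and 8.
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    return base + index * scale;
}

// Advances a long* by `step` elements using the byte arithmetic above.
// With scale == sizeof(long) this matches ordinary pointer arithmetic p + step.
long* advance_by(long* p, std::size_t step) {
    std::uintptr_t addr =
        lea_like(reinterpret_cast<std::uintptr_t>(p), step, sizeof(long));
    return reinterpret_cast<long*>(addr);
}
```

Using `sizeof(long)` rather than a hard-coded 4 or 8 keeps the stride correct for whichever size long has on the target.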

Answer 3

How about an answer that works for both x86 and x64 (although it does assume longs are always 4 bytes, a la Windows)? The main change from the OP is using "+r" and (temp).

#include <iostream>

using namespace std;

int main(){

  int ARR_LEN=20;
  size_t STEP_SIZE=2;
  const long SOME_VALUE=100;  // const, so the "i" constraint sees a compile-time constant

  long* arr = new long[ARR_LEN];

  for (int i=0; i<ARR_LEN; i++){
    arr[i] = 0;
  }

  long* temp = arr;

   asm volatile (
      "loop:\n\t"
      "movl %[sval], (%[aptr])\n\t"
      "lea (%[aptr], %[incr], %c[size]), %[aptr]\n\t"
      "cmp %[eptr], %[aptr]\n\t"
      "jl loop\n\t"
      : [aptr] "+r" (temp)
      : [eptr] "r" (arr + ARR_LEN),
        [incr] "r" (STEP_SIZE),
        [sval] "i" (SOME_VALUE),
        [size] "i" (sizeof(long))
      : "cc", "memory"
   );

  for (int i=0; i<ARR_LEN; i++){
    cout << "element " << i << " is " << arr[i] << endl;
  }

  delete[] arr;

  return 0;
}
