
A simple while-loop in GCC inline assembly

I want to write the following loop using GCC extended inline ASM:

long* arr = new long[ARR_LEN]();
long* act_ptr = arr;
long* end_ptr = arr + ARR_LEN;

while (act_ptr < end_ptr)
{
    *act_ptr = SOME_VALUE;
    act_ptr += STEP_SIZE;
}

delete[] arr;

An array of type long with length ARR_LEN is allocated and zero-initialized. The loop walks through the array in increments of STEP_SIZE. Every touched element is set to SOME_VALUE.
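For comparing results later, the loop described above can be wrapped in a small plain C++ function. This is only a sketch: the name fill_strided and the use of std::vector are my own choices for illustration and are not part of the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference version of the loop: writes `value` to every `step`-th element
// of a zero-initialized array of `len` longs and returns the result.
// (fill_strided is a hypothetical helper name, not from the original post.)
std::vector<long> fill_strided(std::size_t len, std::size_t step, long value) {
    assert(step > 0);                  // a zero step would never terminate
    std::vector<long> arr(len, 0L);    // zero-initialized, like new long[len]()
    for (std::size_t i = 0; i < len; i += step) {
        arr[i] = value;
    }
    return arr;
}
```

Calling fill_strided(ARR_LEN, STEP_SIZE, SOME_VALUE) gives the array contents that any inline-assembly version should reproduce.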

Well, this was my first attempt in GAS:

long* arr = new long[ARR_LEN]();

asm volatile
(
    "loop:"
    "movl %[sval], (%[aptr]);"
    "leal (%[aptr], %[incr], 4), %[aptr];"
    "cmpl %[eptr], %[aptr];"
    "jl loop;"
    : // no output
    : [aptr] "r" (arr),
      [eptr] "r" (arr + ARR_LEN),
      [incr] "r" (STEP_SIZE),
      [sval] "i" (SOME_VALUE)
    : "cc", "memory"
);

delete[] arr;

As mentioned in the comments, this assembler code is really more of a do {...} while loop, but it does in fact do the same work as long as the array is not empty.
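To make the while versus do {...} while difference concrete, here is a small sketch that only counts how often the loop body would run in each form. The function names are hypothetical; indices are used instead of pointers so that nothing is written out of bounds.

```cpp
#include <cassert>
#include <cstddef>

// Number of times the body runs when the condition is tested first (while).
std::size_t iterations_while(std::size_t len, std::size_t step) {
    assert(step > 0);
    std::size_t n = 0;
    std::size_t i = 0;
    while (i < len) { ++n; i += step; }
    return n;
}

// Number of times the body runs when the condition is tested last
// (do/while), which is the shape of the assembly loop.
std::size_t iterations_do_while(std::size_t len, std::size_t step) {
    assert(step > 0);
    std::size_t n = 0;
    std::size_t i = 0;
    do { ++n; i += step; } while (i < len);
    return n;
}
```

For ARR_LEN = 20 and STEP_SIZE = 2 both forms run the body 10 times. They differ only for an empty array, where the do {...} while form still runs the body once, which in the assembly version would mean one store past the allocation.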

The strange thing about this piece of code is that it worked fine for me at first. But when I later tried to make it work in another project, it seemed as if it didn't do anything. I even made some 1:1 copies of the working project, compiled again and... still the result is random.

Maybe I chose the wrong constraints for the input operands, but I have tried nearly all of them by now and have no real idea left. What puzzles me in particular is that it still works in some cases.

I am not an expert at ASM whatsoever, although I learned it when I was still at university. Please note that I am not looking for optimization; I am just trying to understand how inline assembly works. So here is my question: Is there anything fundamentally wrong with my attempt, or did I make a more subtle mistake here? Thanks in advance.

(Working with g++ MinGW Win32 x86 v.4.8.1)

Update

I have already tried out every single suggestion that has been contributed here so far. In particular I tried

  • using the "q" operand constraint instead of "r"; sometimes it works, sometimes it doesn't,
  • writing ... : [aptr] "=r" (arr) : "0" (arr) ... instead, with the same result,
  • or even ... : [aptr] "+r" (arr) : ... , still the same.

Meanwhile I know the official documentation pretty much by heart, but I still can't see my error.

You are modifying an input operand (aptr), which is not allowed. Either constrain it to match an output operand or change it to an input/output operand.
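As a minimal illustration of an input/output operand, separate from the loop in the question, the following sketch adds one integer to another in place. The function names are made up for this example, and the plain C++ branch is only there so the snippet also builds on non-x86 targets.

```cpp
#include <cassert>

// Adds `delta` to `x`. On x86/x86-64 with GCC or Clang this is done by a
// single inline-asm instruction; elsewhere plain C++ is used instead.
// (add_in_place is a hypothetical name used only for this illustration.)
int add_in_place(int x, int delta) {
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    asm("addl %[d], %[acc]"
        : [acc] "+r"(x)     // "+r": the asm both reads and writes x
        : [d] "r"(delta)    // plain input: the asm must leave it unchanged
        : "cc");            // addl modifies the flags register
#else
    x += delta;             // fallback for other compilers or targets
#endif
    return x;
}

// Hypothetical self-check showing the intended use.
void check_add_in_place() {
    assert(add_in_place(40, 2) == 42);
}
```

Because x is declared with "+r", the compiler knows its register holds a new value after the asm statement. With a plain "r" input the compiler may assume the register is unchanged, which would fit the "sometimes it works" behavior described in the question.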

Here is complete code that has the intended behavior.

  • Note that the code is written for a 64-bit machine. Therefore, for example, %%rbx is used instead of %%ebx as the base address for the array. For the same reason, leaq and cmpq should be used instead of leal and cmpl.
  • movq should be used since the array is of type long.
  • Type long is 8 bytes, not 4 bytes, on a 64-bit machine (this holds for LP64 systems such as Linux; on 64-bit Windows long remains 4 bytes).
  • jl in the question should be changed to jg, because the operands of cmpq are in the opposite order from the question's cmpl.
  • Register labels are not used here. The compiler prints a named operand at the width of its C type, so an int operand would appear as a 32-bit register (e.g., ebx) where the 64-bit instructions need the full register.
  • Constraint "r" is not used either. "r" lets the compiler pick any general-purpose register; this code instead pins each operand to a specific register with "a", "b", "c" and "d". Look here: x86 addressing modes

     #include <iostream>
     using namespace std;

     int main(){
       int ARR_LEN=20;
       long STEP_SIZE=2;    // 64-bit on LP64, so all of %rcx is loaded
       long SOME_VALUE=100;
       long* arr = new long[ARR_LEN];
       int i;
       for (i=0; i<ARR_LEN; i++){
         arr[i] = 0;
       }
       long* ptr = arr;     // the asm advances this copy, not arr itself

       __asm__ __volatile__ (
         "loop:"
         "movq %%rdx, (%%rbx);"
         "leaq (%%rbx, %%rcx, 8), %%rbx;"
         "cmpq %%rbx, %%rax;"
         "jg loop;"
         : "+b" (ptr)       // %rbx is modified, so it is an in/out operand
         : "a" (arr+ARR_LEN), "c" (STEP_SIZE), "d" (SOME_VALUE)
         : "cc", "memory"
       );

       for (i=0; i<ARR_LEN; i++){
         cout << "element " << i << " is " << arr[i] << endl;
       }
       delete[] arr;
       return 0;
     }
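Since the hard-coded scale of 8 in the code above depends on the size of long, one way to avoid guessing is to derive the byte stride from sizeof(long). The sketch below is only an illustration with made-up helper names; it does not touch inline assembly at all.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Byte distance between consecutive stores for a given element step.
// Derived from sizeof(long) instead of a literal 4 or 8, because long is
// 8 bytes on LP64 targets (e.g. x86-64 Linux) but 4 bytes on 64-bit Windows.
// (stride_bytes is a hypothetical helper name.)
std::size_t stride_bytes(std::size_t step_elems) {
    // Guard against overflow of the multiplication below.
    assert(step_elems <= std::numeric_limits<std::size_t>::max() / sizeof(long));
    return step_elems * sizeof(long);
}

// True when a long is as wide as a pointer, i.e. when 64-bit movq/leaq with
// a scale of 8 match the element size on a 64-bit target.
bool long_matches_pointer_width() {
    return sizeof(long) == sizeof(void*);
}
```

If long_matches_pointer_width() returns false on a 64-bit target, movq with a scale of 8 would write two array elements per store, so the 4-byte forms would be needed for the data while the pointers stay 64-bit.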

How about an answer that works for both x86 and x64 (although it does assume longs are always 4 bytes, a la Windows)? The main change from the OP is using "+r" and (temp).

#include <iostream>

using namespace std;

int main(){

  int ARR_LEN=20;
  size_t STEP_SIZE=2;
  const long SOME_VALUE=100;  // const, so the "i" (immediate) constraint can be satisfied without optimization

  long* arr = new long[ARR_LEN];

  for (int i=0; i<ARR_LEN; i++){
    arr[i] = 0;
  }

  long* temp = arr;

   asm volatile (
      "loop:\n\t"
      "movl %[sval], (%[aptr])\n\t"
      "lea (%[aptr], %[incr], %c[size]), %[aptr]\n\t"
      "cmp %[eptr], %[aptr]\n\t"
      "jl loop\n\t"
      : [aptr] "+r" (temp)
      : [eptr] "r" (arr + ARR_LEN),
        [incr] "r" (STEP_SIZE),
        [sval] "i" (SOME_VALUE),
        [size] "i" (sizeof(long))
      : "cc", "memory"
   );

  for (int i=0; i<ARR_LEN; i++){
    cout << "element " << i << " is " << arr[i] << endl;
  }

  delete[] arr;

  return 0;
}
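A possible variation on the code above, offered as a sketch rather than a definitive version: it wraps the loop in a function (the name fill_strided_asm is made up), uses a numeric local label so the asm can be inlined or copied without a duplicate "loop" symbol, passes the value in a register so that mov takes its width from the type long, and compares the pointers as unsigned with jb. The plain C++ branch exists only so the snippet builds on non-x86 targets.

```cpp
#include <cassert>
#include <cstddef>

// Stores `value` into every `step`-th long in [begin, end).
// Assumes begin < end (the asm loop tests its condition after the store)
// and step > 0.
void fill_strided_asm(long* begin, long* end, std::size_t step, long value) {
    assert(begin < end && step > 0);
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    long* p = begin;                   // working copy advanced by the asm
    asm volatile(
        "1:\n\t"                       // numeric local label, safe to duplicate
        "mov %[sval], (%[aptr])\n\t"   // width follows the register for `long`
        "lea (%[aptr], %[incr], %c[size]), %[aptr]\n\t"
        "cmp %[eptr], %[aptr]\n\t"
        "jb 1b\n\t"                    // unsigned: continue while aptr < eptr
        : [aptr] "+&r"(p)              // in/out, early-clobbered inside the loop
        : [eptr] "r"(end),
          [incr] "r"(step),
          [sval] "r"(value),
          [size] "i"(sizeof(long))
        : "cc", "memory");
#else
    // Fallback for other compilers or targets: same stores in plain C++.
    const std::size_t len = static_cast<std::size_t>(end - begin);
    for (std::size_t i = 0; i < len; i += step) begin[i] = value;
#endif
}
```

With the variables from the program above, the call would be fill_strided_asm(arr, arr + ARR_LEN, STEP_SIZE, SOME_VALUE). The "&" in "+&r" asks the compiler not to share that register with any input, since the pointer is overwritten while the inputs are still needed on later iterations.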
