
How to do a simple float4 add in GCC/MinGW inline assembly

I need something like the following, but written in GCC's unusual inline assembly syntax:

void add4(float* a, float* b, float* out) 
{ 
  mov edx, [esp+4] 
  movaps  xmm0, oword [edx+0]   
  mov edx, [esp+8] 
  movaps  xmm1, oword [edx+0]   
  addps  xmm0, xmm1 
  mov edx, [esp+12] 
  movaps oword [edx+0], xmm0 
  ret 
} 

There are two parts to this question:

1) How do I write this in GCC inline assembly syntax?
2) How can I rewrite it (perhaps getting rid of the explicit memory accesses) so that the inlined routine integrates well with the surrounding GCC (mingw32) code?

There are two ways to do this.

The first way is to use the asm keyword to include inline assembly as a string literal. You can also pass in the function parameters as operands, and GCC will generate the code needed to access them. This saves you from writing the memory accesses by hand, which matters especially when dealing with different calling conventions. This is the general way to embed assembly in C functions.
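As a rough illustration of the asm approach, here is a minimal sketch using GCC extended asm in the default AT&T syntax. The name add4_asm is made up for this example, and the sketch assumes an x86 target with SSE enabled (for 32-bit mingw32 that likely means compiling with -msse) and that all three pointers are 16-byte aligned, since movaps and the memory form of addps fault on unaligned addresses.

```c
/* Hypothetical helper, not from the question: out[i] = a[i] + b[i] for i = 0..3.
   Assumes a, b and out are 16-byte aligned. */
void add4_asm(const float *a, const float *b, float *out)
{
    __asm__ volatile (
        "movaps (%1), %%xmm0\n\t"   /* xmm0 = four floats loaded from a */
        "addps  (%2), %%xmm0\n\t"   /* xmm0 += four floats loaded from b */
        "movaps %%xmm0, (%0)\n\t"   /* store the four sums to out */
        :                            /* no output operands; the result goes through memory */
        : "r"(out), "r"(a), "r"(b)   /* GCC picks the registers that hold the pointers */
        : "xmm0", "memory"           /* xmm0 is overwritten and memory is read and written */
    );
}
```

Because the pointers are passed as operands, there is no [esp+4]-style argument fetching and no ret: GCC handles the calling convention and is free to inline the function.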

The second way, which is more specific to what you're trying to do, is to use SSE intrinsics (provided by <xmmintrin.h>). The resulting code looks like normal C function calls, but the compiler generates the corresponding instructions instead of a bunch of function calls. See the Intel Intrinsics Guide for more info on how to use these intrinsics.
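A minimal sketch of the intrinsics version might look like the following. The name add4_sse is made up for this example; the unaligned load and store intrinsics are used so the sketch does not depend on how the caller allocated the arrays. If the data is known to be 16-byte aligned, _mm_load_ps and _mm_store_ps correspond to the movaps in the question. As with the asm version, 32-bit targets may need -msse.

```c
#include <xmmintrin.h>

/* Hypothetical helper, not from the question: out[i] = a[i] + b[i] for i = 0..3. */
void add4_sse(const float *a, const float *b, float *out)
{
    __m128 va = _mm_loadu_ps(a);             /* load four floats from a */
    __m128 vb = _mm_loadu_ps(b);             /* load four floats from b */
    _mm_storeu_ps(out, _mm_add_ps(va, vb));  /* packed add (addps), then store to out */
}
```

Since the compiler understands what each intrinsic does, it can inline the function, keep values in xmm registers across calls, and schedule the instructions together with the surrounding code, which addresses the second part of the question.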
