Inline assembly; Bitwise operation on float; What's going awry here?

This simple piece of code is my problem:

Extended asm (gcc); Intel syntax (-masm=intel); Platform: x86

What it should do: return a float of magnitude one with the same sign (+/-) as x.

    float signf(float x)
    {
      float r = 1;
      asm volatile (
            "and %1,0x80000000;"
            "or %0,%1;"
            :"=r"(r):"r"(x));
      return r;
    }

Calling it with an arbitrary random number chosen by a fair dice roll gives:

    signf of -1352353.3253: -5.60519e-045

The actual problem with your inline asm is that you declare r as output only, so the compiler is free to optimize away its initialization. Use the "+r" constraint instead of "=r" and it should work.

A better optimized version could look like:

    float signf(float x)
    {
        float r;
        __asm__  __volatile__ (
                "and %0, 0x80000000;"
                "or %0, 0x3f800000;"
                :"=r"(r):"0"(x));
        return r;
    }

Note that this function involves float->int->float conversion (through memory) which may affect performance.

The C version of the above code is:
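The two constants in the asm above are the sign-bit mask and the bit pattern of 1.0f. The following is a minimal sketch for checking that, assuming float is a 32-bit IEEE-754 single (as it is on x86); `float_bits` and `check_constants` are hypothetical helper names, not part of the original answer.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Assumption: float is a 32-bit IEEE-754 single (true on x86).
static_assert(sizeof(float) == sizeof(std::uint32_t), "expects 32-bit float");

// Hypothetical helper: return the raw bit pattern of a float.
std::uint32_t float_bits(float f)
{
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);  // well-defined way to reinterpret bits
    return u;
}

// Hypothetical helper: confirm the constants used in the asm.
void check_constants()
{
    assert(float_bits(1.0f)  == 0x3f800000u);  // pattern OR-ed in by the asm
    assert(float_bits(-1.0f) == 0xbf800000u);  // same pattern with the sign bit set
    assert(float_bits(-0.0f) == 0x80000000u);  // the sign-bit mask alone
}
```

So masking x with 0x80000000 keeps only its sign, and OR-ing in 0x3f800000 attaches that sign to 1.0f.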

    float signf(float x)
    {
        union { float f; int i; } tmp, res;
        tmp.f = x;
        res.f = 1;
        res.i |= tmp.i & 0x80000000;
        return res.f;
    }

This generates identical code for me (using gcc 4.4.5).

The simple C approach return x < 0 ? -1 : 1; generates full FPU code without conversion or memory accesses (except for loading the operand), so it might perform better. It also uses fcmov if available to avoid branching. Needs some benchmarking.

There are two C++ functions for this in C++11:

    bool std::signbit (x);

http://en.cppreference.com/w/cpp/numeric/math/signbit

or,

    float f = std::copysign (1.0f, x);

http://en.cppreference.com/w/cpp/numeric/math/copysign
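As a minimal sketch of how the copysign suggestion replaces the whole asm routine: the wrapper name `signf_std` and the check function are hypothetical, chosen here only to avoid clashing with the signf versions in the other answers.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical wrapper: magnitude 1.0f with the sign of x.
float signf_std(float x)
{
    return std::copysign(1.0f, x);
}

// Hypothetical self-check covering the question's input and signed zero.
void check_signf_std()
{
    assert(signf_std(-1352353.3253f) == -1.0f);
    assert(signf_std(42.0f) == 1.0f);
    // copysign looks at the sign bit, so -0.0f maps to -1.0f.
    assert(signf_std(-0.0f) == -1.0f);
    assert(std::signbit(-0.0f));
}
```

Because it works on the sign bit rather than on a comparison, this treats -0.0f as negative, which matches what the bit-masking asm does.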

This seems to work well (AT&T syntax):

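    float signf(float x)
    {
      float r = 1;
      // x is declared read-write ("+r") because the asm modifies %1;
      // gcc assumes input-only operands are left unchanged.
      asm ("andl $0x80000000, %1\n"
           "\torl %1, %0\n"
           :"+r"(r), "+r"(x));
      return r;
    }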

TBH, I would use copysignf() as suggested by others. What you are trying to do is unportable because it is tied both to the IA-32 platform and to C++ compilers that support this asm() statement.

EDIT 1

BTW, the following version works the same (and generates pretty much the same instructions as the above asm() statement) and is free of non-portable stuff and type aliasing issues (unlike the union based or reinterpret_cast<> based versions suggested by others).

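    #include <cstring>  // std::memcpy

    float signf3(float x)
    {
      // Copy the bits of x into an integer.
      unsigned u;
      std::memcpy(&u, &x, sizeof (u)) ;

      // Copy the bits of 1.0f into an integer.
      float r = 1.f;
      unsigned uone;
      std::memcpy(&uone, &r, sizeof (uone));

      // Attach the sign bit of x to the bits of 1.0f.
      uone |= u & 0x80000000;

      std::memcpy(&r, &uone, sizeof (r));
      return r;
    }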

This question is tagged C++, so I'll offer two C++ suggestions that leave the optimizing to the compiler:

  • return x < 0.0f ? -1.0f : 1.0f;
  • return x / std::abs(x); // I believe self-division shouldn't cause 'almost 1.0' numbers to be generated (note: x == 0 yields NaN)
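The two suggestions differ at the edges, which a small sketch can make concrete. The function names below are hypothetical, and the checks assume IEEE-754 floats compiled without flags such as -ffast-math.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical names for the two suggestions above.
float sign_ternary(float x) { return x < 0.0f ? -1.0f : 1.0f; }
float sign_divide(float x)  { return x / std::abs(x); }

// Hypothetical self-check of ordinary and edge-case inputs.
void check_edge_cases()
{
    // Ordinary nonzero inputs: both give exactly +/-1.
    assert(sign_ternary(-3.5f) == -1.0f);
    assert(sign_divide(-3.5f)  == -1.0f);
    assert(sign_ternary(2.0f)  == 1.0f);
    assert(sign_divide(2.0f)   == 1.0f);

    // Negative zero compares equal to zero, so the ternary returns +1.
    assert(sign_ternary(-0.0f) == 1.0f);

    // Zero divided by zero is NaN.
    assert(std::isnan(sign_divide(0.0f)));
}
```

Neither variant reproduces the bit-masking behaviour for -0.0f, so which one is acceptable depends on whether signed zero matters to the caller.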

You do not need to use asm for this. The following does what you tried to do (and even gives the correct result for -0.0f).

    #include <cstdint>  // uint32_t

    float signf(float x) {
        bool sign=(0!=(*(reinterpret_cast<uint32_t *>(&x)) & 0x80000000));
        return sign? -1.0f : 1.0f;
    }
