
Bugs when multiplying numbers with bitwise operators

I am trying to multiply two floating-point numbers in IEEE-754 format using bitwise operators. The 32-bit number is composed in the form sign - exponent - mantissa. After multiplying, the result is correct some of the time but not all of the time.

I think it has something to do with the result not being in normalized form (e.g. 1.1010101 × 2^5), but I don't know how to fix it.

#include <cstdio>

struct Real
{
   int sign;                // sign bit: 0 or 1
   long exponent;
   unsigned long fraction;
};

Real Multiply(Real Val1, Real Val2){
   Real answer;
   answer.fraction = Val1.fraction + Val2.fraction;
   answer.exponent = Val1.exponent + Val2.exponent;
   answer.sign = Val1.sign ^ Val2.sign;
   return answer;
}

When multiplying, the mantissa parts must be multiplied together, not added:

$$(-1)^{\text{sign1}} \times 2^{\text{exp1}} \times \text{mantissa1} \times (-1)^{\text{sign2}} \times 2^{\text{exp2}} \times \text{mantissa2}$$
$$= (-1)^{\text{sign1} + \text{sign2}} \times 2^{\text{exp1} + \text{exp2}} \times \text{mantissa1} \times \text{mantissa2}$$
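
As a quick sanity check of this identity, here is a minimal sketch that works on plain doubles rather than on bit fields. The helper names `compose` and `multiply_components` are hypothetical and used only for illustration; the sketch assumes the sign is stored as 0 or 1.

```cpp
#include <cmath>

// Hypothetical helper: builds (-1)^sign * mantissa * 2^exp as a double.
double compose(int sign, int exp, double mantissa) {
    return (sign ? -1.0 : 1.0) * std::ldexp(mantissa, exp);
}

// Multiplies two numbers component by component, following the identity:
// exponents are added and mantissas are multiplied.
double multiply_components(int s1, int e1, double m1,
                           int s2, int e2, double m2) {
    // For sign bits of 0 or 1, (-1)^(s1 + s2) equals (-1)^(s1 ^ s2),
    // which is why XOR is enough for the sign.
    return compose(s1 ^ s2, e1 + e2, m1 * m2);
}
```

For example, 1.5 × 2^1 (which is 3) times -1.25 × 2^2 (which is -5) gives -1.875 × 2^3, and `multiply_components(0, 1, 1.5, 1, 2, 1.25)` agrees with `compose(0, 1, 1.5) * compose(1, 2, 1.25)`.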

And you don't need a separate variable for the return value:

Real Multiply(Real Val1, Real Val2){
   Val1.fraction *= Val2.fraction;
   Val1.exponent += Val2.exponent;
   Val1.sign ^= Val2.sign;
   return Val1;
}

After those basics you still have to normalize the result, and for that you need the full product rather than just the low bits that an ordinary non-widening multiplication gives you. Therefore you must cast your "fraction" (in IEEE-754 the correct term is significand) to a wider type before multiplying. Depending on your platform, you may or may not have a type twice as wide as unsigned long, so it's better to use fixed-width types like int32_t and uint64_t in this case. Those are all the hints you need to do this.
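
To make the hints concrete, here is one possible sketch of a widening multiply followed by normalization. It is not the asker's code: the struct `Real32`, its field `significand`, and the function `MultiplyNormalized` are hypothetical names, and the layout is an assumption (the 24-bit significand is stored with its leading 1 made explicit, and the exponent is unbiased). The sketch truncates instead of rounding and ignores zero, subnormals, infinities, NaN, and exponent overflow.

```cpp
#include <cstdint>

// Assumed layout (not from the original post): `significand` holds the full
// 24-bit significand with the leading 1 explicit (bit 23 set), and `exponent`
// is unbiased, so value = (-1)^sign * (significand / 2^23) * 2^exponent.
struct Real32 {
    std::uint32_t sign;
    std::int32_t exponent;
    std::uint32_t significand;
};

Real32 MultiplyNormalized(Real32 a, Real32 b) {
    Real32 r;
    r.sign = a.sign ^ b.sign;
    r.exponent = a.exponent + b.exponent;

    // Widen before multiplying so the full 48-bit product is kept.
    std::uint64_t product =
        static_cast<std::uint64_t>(a.significand) * b.significand;

    // Two significands in [1, 2) give a product in [1, 4), so the
    // highest set bit is either bit 46 or bit 47.
    if (product & (std::uint64_t{1} << 47)) {
        product >>= 1;    // bring the product back into [1, 2)
        r.exponent += 1;  // and compensate in the exponent
    }

    // Keep the top 24 bits; the low 23 bits are dropped (truncation).
    r.significand = static_cast<std::uint32_t>(product >> 23);
    return r;
}
```

With this layout, 1.5 is stored as significand 0xC00000 with exponent 0. Multiplying it by itself yields significand 0x900000 with exponent 1, that is 1.125 × 2^1 = 2.25.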
