convert unsigned integer to float in python
I wrote a socket server that reads data from some devices. After reading the data, a binary shift is applied to the bytes. After that I get an integer value, for instance
1108304047
and I want to convert this number to the IEEE 754 float 35.844417572021484.
I found some solutions with struct.unpack, but they don't seem rational to me: first we convert the number to a string, then convert that string to a float.
Is there any short way, like
Float.intBitsToFloat(1108304047)
in Java?
The solution that I found with struct.unpack is quite long. It contains string conversion, substring fetching, zero filling, etc.
import struct

# Python 2 only: str.decode('hex') was removed in Python 3
def convert_to_float(value):
    return struct.unpack("!f", hex(value)[2:].zfill(8).decode('hex'))[0]
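(For reference, on Python 3, where str.decode('hex') no longer exists, the same reinterpretation can be written with int.to_bytes; this variant is my addition, not part of the original question:)

```python
import struct

def convert_to_float_py3(value):
    # reinterpret the 32-bit pattern: 4 big-endian bytes -> IEEE 754 float
    return struct.unpack("!f", value.to_bytes(4, "big"))[0]

print(convert_to_float_py3(1108304047))  # 35.844417572021484
```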
As you can see, in Java they use a C union to do the trick:
/*
 * Find the float corresponding to a given bit pattern
 */
JNIEXPORT jfloat JNICALL
Java_java_lang_Float_intBitsToFloat(JNIEnv *env, jclass unused, jint v)
{
    union {
        int i;
        float f;
    } u;
    u.i = (long)v;
    return (jfloat)u.f;
}
In Python it is not possible to do it this way, hence you need to use the struct library:

"This module performs conversions between Python values and C structs represented as Python strings."
First the number is converted to its long representation:
packed_v = struct.pack('>l', b)
and then it is unpacked as a float:
f = struct.unpack('>f', packed_v)[0]
That's similar to what Java does.
import struct

def intBitsToFloat(b):
    s = struct.pack('>l', b)
    return struct.unpack('>f', s)[0]
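As a quick check with the number from the question (a usage example of my own, not from the answer):

```python
import struct

def intBitsToFloat(b):
    s = struct.pack('>l', b)
    return struct.unpack('>f', s)[0]

print(intBitsToFloat(1108304047))  # 35.844417572021484
```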
Please correct me if I'm wrong.
ldexp and frexp decompositions for positive numbers. If you are okay with up to 2^-16 relative error, you can express both sides of the transformation using just basic arithmetic and the ldexp/frexp decomposition.
Note that this is much slower than the struct hack, which can be written more succinctly as struct.unpack('f', struct.pack('I', value))[0].
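(A small demo of my own: struct.unpack returns a 1-tuple, hence the trailing [0]; the native-order 'I'/'f' codes are consistent between the pack and the unpack, so this works regardless of endianness.)

```python
import struct

value = 1108304047
# pack the bits as a native-order unsigned int, reinterpret as a native-order float
f = struct.unpack('f', struct.pack('I', value))[0]
print(f)  # 35.844417572021484
```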
Here is the decomposition method.
import math

def to_bits(x):
    man, exp = math.frexp(x)
    return int((2 * man + (exp + 125)) * 0x800000)

def from_bits(y):
    y -= 0x3e800000
    return math.ldexp(
        # parentheses around the & are needed: + binds tighter than & in Python,
        # and without them the implicit leading bit would be stripped
        float(0x800000 + (y & 0x7fffff)) / 0x1000000,
        (y - 0x800000) >> 23)
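As a sanity check of the forward transformation (my own test, repeating to_bits so the snippet is self-contained), compare it against the exact bit pattern obtained via struct:

```python
import math
import struct

def to_bits(x):
    man, exp = math.frexp(x)
    return int((2 * man + (exp + 125)) * 0x800000)

x = 35.844417572021484
exact = struct.unpack('>I', struct.pack('>f', x))[0]  # exact float32 bit pattern
print(to_bits(x), exact)  # both should be 1108304047
```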
While the from_bits function looks scarier, it is actually nothing more than the inverse of to_bits, modified so that we only perform a single floating point division (not because of speed considerations, just because it should be the sort of mindset we have when we do need to work with machine representations of floats). Therefore, I'll focus on explaining the forward transformation.
Recall that a (positive) IEEE 754 floating point number is represented as a tuple of a biased exponent and its mantissa. The lower 23 bits m are the mantissa, and the upper 8 bits e (minus the most significant bit, which we assume to always be zero) represent the exponent, so that

x = (1 + m / 2^23) * 2^(e - 127)
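Plugging the question's bit pattern into this formula (a worked example I added):

```python
bits = 1108304047
m = bits & 0x7fffff          # low 23 bits: mantissa
e = (bits >> 23) & 0xff      # next 8 bits: biased exponent
x = (1 + m / 2**23) * 2**(e - 127)
print(m, e, x)  # 1007791 132 35.844417572021484
```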
Let man' = m / 2^23 and exp' = e - 127; then 0 <= man' < 1 and exp' is an integer. Therefore

(man' + exp' + 127) * 2^23

gives the IEEE 754 representation.
On the other hand, the frexp decomposition computes a pair man, exp = frexp(x) such that man * 2^exp = x, and 0.5 <= man < 1.
A moment of thought will show that man' = 2 * man - 1 and exp' = exp - 1, therefore its IEEE machine representation is

(man' + exp' + 127) * 0x800000 = (2 * man + exp + 125) * 0x800000
How much roundoff error do we expect? Well, let's assume that frexp introduces no error within its decomposition. This is unfortunately impossible, but we can relax this down the line.
The main feature is the computation 2 * man + (exp + 125). Why? 0x800000 is a perfect power of two, and therefore a floating point multiplication by a power of two will nearly always be lossless (unless we overflow), since the FPU is just adding 23 << 23 to the machine representation (without touching the mantissa, which is where error arises). Similarly, the multiplication 2 * man is also lossless (akin to just adding 1 << 23 to the machine representation). Furthermore, exp and 125 are integers, so (exp + 125) is also computed to exact precision.
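The "just adds 23 << 23 to the machine representation" claim can be observed directly (a demonstration of my own, using struct to read the bit patterns):

```python
import struct

def float_bits(x):
    # bit pattern of x rounded to an IEEE 754 single
    return struct.unpack('>I', struct.pack('>f', x))[0]

x = 1.2345678
# multiplying by 2**23 only bumps the 8-bit exponent field
print(hex(float_bits(x * 0x800000) - float_bits(x)))  # 0xb800000 == 23 << 23
```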
Therefore, we are left to analyze the error behavior of m + e, where 1 <= m < 2 and |e| < 127. In the worst case, m has all 23 bits filled (corresponding to m = 2 - 2^-22) and e = +/- 127. Here, this addition will unfortunately clobber the 8 least significant bits of m, since it has to renormalize m (which is at the exponential range of 2^0) to the exponential range of 2^8, which means losing 8 bits. However, since a mantissa has 24 significant bits, we effectively lose 2^-(24 - 8) amount of precision, which upper-bounds the error.
In a similar line of reasoning for from_bits, you can show that float(0x800000 + (y & 0x7fffff)) is basically computing the operation (1.0f + m), where m may have up to 23 bits of precision and is strictly less than 1. Therefore, we're adding a precise number at the scale of 2^0 to another number at the scale of 2^-1, so we expect a loss of one bit. This then suggests that we incur up to 2^-22 relative error in the backwards transformation.
Both of these transformations incur very little roundoff, and if you throw in an extra multiplication into to_bits, you can also bring its error down to just 2^-22.
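To see how small the round-trip error actually is, here is a quick experiment of my own (note the explicit parentheses around the & in from_bits, which are needed because + binds tighter than & in Python):

```python
import math
import random
import struct

def to_bits(x):
    man, exp = math.frexp(x)
    return int((2 * man + (exp + 125)) * 0x800000)

def from_bits(y):
    y -= 0x3e800000
    return math.ldexp(
        float(0x800000 + (y & 0x7fffff)) / 0x1000000,
        (y - 0x800000) >> 23)

random.seed(1)
worst = 0.0
for _ in range(1000):
    # round to a representable float32 value first
    x = struct.unpack('f', struct.pack('f', random.uniform(1e-3, 1e3)))[0]
    worst = max(worst, abs(from_bits(to_bits(x)) - x) / x)
print(worst)  # comfortably within the 2**-16 bound discussed above
```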
Do not do this in production. This is just a clever float-hack that seems fun. It's not meant to be anything more than that.