What is a subnormal floating point number?

The isnormal() reference page says:

Determines if the given floating point number arg is normal, i.e. is neither zero, subnormal, infinite, nor NaN.

It is clear what it means for a number to be zero, infinite or NaN. But it also says subnormal. When is a number subnormal?

IEEE 754 basics

First let's review the basics of how IEEE 754 numbers are organized.

We'll focus on single precision (32-bit), but everything can be immediately generalized to other precisions.

The format is:

  • 1 bit: sign
  • 8 bits: exponent
  • 23 bits: fraction

The sign is simple: 0 is positive, and 1 is negative, end of story.

The exponent is 8 bits long, and so it ranges from 0 to 255.

The exponent is called biased because it has an offset of -127, e.g.:

  0 == special case: zero or subnormal, explained below
  1 == 2 ^ -126
    ...
125 == 2 ^ -2
126 == 2 ^ -1
127 == 2 ^  0
128 == 2 ^  1
129 == 2 ^  2
    ...
254 == 2 ^ 127
255 == special case: infinity and NaN

The leading bit convention

(What follows is a fictitious hypothetical narrative, not based on any actual historical research.)

While designing IEEE 754, engineers noticed that all numbers, except 0.0, have a 1 as their first binary digit. E.g.:

25.0   == (binary) 11001 == 1.1001 * 2^4
 0.625 == (binary) 0.101 == 1.01   * 2^-1

both start with that annoying 1. part.

Therefore, it would be wasteful to let that digit take up one bit of precision in almost every single number.

For this reason, they created the "leading bit convention":

always assume that the number starts with one

But then how to deal with 0.0? Well, they decided to create an exception:

  • if the exponent is 0
  • and the fraction is 0
  • then the number represents plus or minus 0.0

so that the bytes 00 00 00 00 also represent 0.0, which looks good.

If we only considered these rules, then the smallest non-zero number that can be represented would be:

  • exponent: 0
  • fraction: 1

which looks something like this as a hex fraction, due to the leading bit convention:

1.000002 * 2 ^ (-127)

where .000002 is, in binary, 22 zeroes followed by a 1 (the 23 fraction bits, padded with one trailing 0 bit to fill six hex digits).

We cannot take fraction = 0, otherwise that number would be 0.0.

But then the engineers, who also had a keen aesthetic sense, thought: isn't that ugly? That we jump straight from 0.0 to something that is not even a proper power of 2? Couldn't we represent even smaller numbers somehow? (OK, it was a bit more concerning than "ugly": people were actually getting bad results for their computations, see "How subnormals improve computations" below.)

Subnormal numbers

The engineers scratched their heads for a while, and came back, as usual, with another good idea. What if we create a new rule:

If the exponent is 0, then:

  • the leading bit becomes 0
  • the exponent is fixed to -126 (not -127, as it would be without this exception)

Such numbers are called subnormal numbers (or denormal numbers, which is a synonym).

This rule immediately implies that the number such that:

  • exponent: 0
  • fraction: 0

is still 0.0, which is kind of elegant as it means one less rule to keep track of.

So 0.0 is actually a subnormal number according to our definition! (IEEE 754 and C's fpclassify() nevertheless put zero in a class of its own, separate from the subnormals.)

With this new rule then, the smallest non-subnormal number is:

  • exponent: 1 (0 would be subnormal)
  • fraction: 0

which represents:

1.0 * 2 ^ (-126)

Then, the largest subnormal number is:

  • exponent: 0
  • fraction: 0x7FFFFF (23 one bits)

which equals:

0.FFFFFE * 2 ^ (-126)

where .FFFFFE is once again 23 one bits to the right of the dot (followed by one padding 0 bit).

This is pretty close to the smallest non-subnormal number, which sounds sane.

And the smallest non-zero subnormal number is:

  • exponent: 0
  • fraction: 1

which equals:

0.000002 * 2 ^ (-126)

which also looks pretty close to 0.0!

Unable to find any sensible way to represent numbers smaller than that, the engineers were happy, and went back to viewing cat pictures online, or whatever it is that they did in the 70s instead.

As you can see, subnormal numbers trade precision for range: they reach closer to zero, but with fewer significant bits.

As the most extreme example, the smallest non-zero subnormal:

0.000002 * 2 ^ (-126)

has essentially a precision of a single bit instead of the usual 24 bits (23 stored fraction bits plus the implicit leading bit). For example, if we divide it by two:

0.000002 * 2 ^ (-126) / 2

we actually reach 0.0 exactly!

Visualization

It is always a good idea to have a geometric intuition about what we learn, so here goes.

If we plot IEEE 754 floating point numbers on a line for each given exponent, it looks something like this:

          +---+-------+---------------+-------------------------------+
exponent  |126|  127  |      128      |              129              |
          +---+-------+---------------+-------------------------------+
          |   |       |               |                               |
          v   v       v               v                               v
          -------------------------------------------------------------
floats    ***** * * * *   *   *   *   *       *       *       *       *
          -------------------------------------------------------------
          ^   ^       ^               ^                               ^
          |   |       |               |                               |
          0.5 1.0     2.0             4.0                             8.0

From that we can see that:

  • for each exponent, there is no overlap between the represented numbers
  • for each exponent, we have the same number 2^23 of floating point numbers (here represented by 4 *)
  • within each exponent, points are equally spaced
  • larger exponents cover larger ranges, but with points more spread out

Now, let's bring that down all the way to exponent 0.

Without subnormals, it would hypothetically look like:

          +---+---+-------+---------------+-------------------------------+
exponent  | ? | 0 |   1   |       2       |               3               |
          +---+---+-------+---------------+-------------------------------+
          |   |   |       |               |                               |
          v   v   v       v               v                               v
          -----------------------------------------------------------------
floats    *    **** * * * *   *   *   *   *       *       *       *       *
          -----------------------------------------------------------------
          ^   ^   ^       ^               ^                               ^
          |   |   |       |               |                               |
          0   |   2^-126  2^-125          2^-124                          2^-123
              |
              2^-127

With subnormals, it looks like this:

          +-------+-------+---------------+-------------------------------+
exponent  |   0   |   1   |       2       |               3               |
          +-------+-------+---------------+-------------------------------+
          |       |       |               |                               |
          v       v       v               v                               v
          -----------------------------------------------------------------
floats    * * * * * * * * *   *   *   *   *       *       *       *       *
          -----------------------------------------------------------------
          ^   ^   ^       ^               ^                               ^
          |   |   |       |               |                               |
          0   |   2^-126  2^-125          2^-124                          2^-123
              |
              2^-127

By comparing the two graphs, we see that:

  • subnormals double the length of the range of exponent 0, from [2^-127, 2^-126) to [0, 2^-126)

    The space between floats in the subnormal range is the same as for exponent 1, i.e. [2^-126, 2^-125).

  • the range [2^-127, 2^-126) has half the number of points that it would have without subnormals.

    Half of those points go to fill the other half of the range.

  • the range [0, 2^-127) has some points with subnormals, but none without (apart from 0 itself).

    This lack of points in [0, 2^-127) is not very elegant, and is the main reason for subnormals to exist!

  • since the points are equally spaced:

    • the range [2^-128, 2^-127) has half as many points as [2^-127, 2^-126)
    • [2^-129, 2^-128) has half as many points as [2^-128, 2^-127)
    • and so on

    This is what we mean when saying that subnormals are a tradeoff between size and precision.

Runnable C example

Now let's play with some actual code to verify our theory.

In almost all current desktop machines, C float represents single precision IEEE 754 floating point numbers.

This is in particular the case for my Ubuntu 18.04 amd64 Lenovo P51 laptop.

With that assumption, all assertions pass on the following program:

subnormal.c

#if __STDC_VERSION__ < 201112L
#error C11 required
#endif

#ifndef __STDC_IEC_559__
#error IEEE 754 not implemented
#endif

#include <assert.h>
#include <float.h> /* FLT_HAS_SUBNORM */
#include <inttypes.h>
#include <math.h> /* isnormal */
#include <stdlib.h>
#include <stdio.h>
#include <string.h> /* memcpy */

#if FLT_HAS_SUBNORM != 1
#error float does not have subnormal numbers
#endif

typedef struct {
    uint32_t sign, exponent, fraction;
} Float32;

/* Split a float into its sign, biased exponent and fraction fields. */
Float32 float32_from_float(float f) {
    uint32_t bytes;
    Float32 float32;
    /* memcpy avoids the strict aliasing violation of *(uint32_t*)&f. */
    memcpy(&bytes, &f, sizeof(bytes));
    float32.fraction = bytes & 0x007FFFFF;
    bytes >>= 23;
    float32.exponent = bytes & 0x000000FF;
    bytes >>= 8;
    float32.sign = bytes & 0x00000001;
    return float32;
}

/* Build a float from its sign, biased exponent and fraction fields. */
float float_from_bytes(
    uint32_t sign,
    uint32_t exponent,
    uint32_t fraction
) {
    uint32_t bytes;
    float f;
    bytes = 0;
    bytes |= sign;
    bytes <<= 8;
    bytes |= exponent;
    bytes <<= 23;
    bytes |= fraction;
    memcpy(&f, &bytes, sizeof(f));
    return f;
}

int float32_equal(
    float f,
    uint32_t sign,
    uint32_t exponent,
    uint32_t fraction
) {
    Float32 float32;
    float32 = float32_from_float(f);
    return
        (float32.sign     == sign) &&
        (float32.exponent == exponent) &&
        (float32.fraction == fraction)
    ;
}

void float32_print(float f) {
    Float32 float32 = float32_from_float(f);
    printf(
        "%" PRIu32 " %" PRIu32 " %" PRIu32 "\n",
        float32.sign, float32.exponent, float32.fraction
    );
}

int main(void) {
    /* Basic examples. */
    assert(float32_equal(0.5f, 0, 126, 0));
    assert(float32_equal(1.0f, 0, 127, 0));
    assert(float32_equal(2.0f, 0, 128, 0));
    assert(isnormal(0.5f));
    assert(isnormal(1.0f));
    assert(isnormal(2.0f));

    /* Quick review of C hex floating point literals. */
    assert(0.5f == 0x1.0p-1f);
    assert(1.0f == 0x1.0p0f);
    assert(2.0f == 0x1.0p1f);

    /* Sign bit. */
    assert(float32_equal(-0.5f, 1, 126, 0));
    assert(float32_equal(-1.0f, 1, 127, 0));
    assert(float32_equal(-2.0f, 1, 128, 0));
    assert(isnormal(-0.5f));
    assert(isnormal(-1.0f));
    assert(isnormal(-2.0f));

    /* The special case of 0.0 and -0.0. */
    assert(float32_equal( 0.0f, 0, 0, 0));
    assert(float32_equal(-0.0f, 1, 0, 0));
    assert(!isnormal( 0.0f));
    assert(!isnormal(-0.0f));
    assert(0.0f == -0.0f);

    /* ANSI C defines FLT_MIN as the smallest non-subnormal number. */
    assert(FLT_MIN == 0x1.0p-126f);
    assert(float32_equal(FLT_MIN, 0, 1, 0));
    assert(isnormal(FLT_MIN));

    /* The largest subnormal number. */
    float largest_subnormal = float_from_bytes(0, 0, 0x7FFFFF);
    assert(largest_subnormal == 0x0.FFFFFEp-126f);
    assert(largest_subnormal < FLT_MIN);
    assert(!isnormal(largest_subnormal));

    /* The smallest non-zero subnormal number. */
    float smallest_subnormal = float_from_bytes(0, 0, 1);
    assert(smallest_subnormal == 0x0.000002p-126f);
    assert(0.0f < smallest_subnormal);
    assert(!isnormal(smallest_subnormal));

    return EXIT_SUCCESS;
}

GitHub upstream.

Compile and run with:

gcc -ggdb3 -O0 -std=c11 -Wall -Wextra -Wpedantic -Werror -o subnormal.out subnormal.c
./subnormal.out

C++

In addition to exposing all of C's APIs, C++ also exposes some extra subnormal-related functionality in <limits> that is not as readily available in C, e.g.:

  • denorm_min: returns the minimum positive subnormal value of the type T

In C++ the whole API is templated for each floating point type, and is much nicer.

Implementations

x86_64 and ARMv8 implement IEEE 754 directly in hardware, which the C code translates to.

Subnormals seem to be slower than normals in certain implementations: Why does changing 0.1f to 0 slow down performance by 10x? This is mentioned in the ARM manual, see the "ARMv8 details" section of this answer.

ARMv8 details

ARM Architecture Reference Manual ARMv8 DDI 0487C.a, section A1.5.4 "Flush-to-zero", describes a configurable mode where subnormals are rounded to zero to improve performance:

The performance of floating-point processing can be reduced when doing calculations involving denormalized numbers and Underflow exceptions. In many algorithms, this performance can be recovered, without significantly affecting the accuracy of the final result, by replacing the denormalized operands and intermediate results with zeros. To permit this optimization, ARM floating-point implementations allow a Flush-to-zero mode to be used for different floating-point formats as follows:

  • For AArch64:

    • If FPCR.FZ==1, then Flush-to-Zero mode is used for all Single-Precision and Double-Precision inputs and outputs of all instructions.

    • If FPCR.FZ16==1, then Flush-to-Zero mode is used for all Half-Precision inputs and outputs of floating-point instructions, other than:

      • Conversions between Half-Precision and Single-Precision numbers.
      • Conversions between Half-Precision and Double-Precision numbers.

A1.5.2 "Floating-point standards, and terminology", Table A1-3 "Floating-point terminology" confirms that subnormals and denormals are synonyms:

This manual                  IEEE 754-2008
-------------------------    -------------
[...]
Denormal, or denormalized    Subnormal

C5.2.7 "FPCR, Floating-point Control Register" describes how ARMv8 can optionally raise exceptions or set a flag bit whenever the input of a floating point operation is subnormal:

FPCR.IDE, bit [15] Input Denormal floating-point exception trap enable. Possible values are:

  • 0b0 Untrapped exception handling selected. If the floating-point exception occurs then the FPSR.IDC bit is set to 1.

  • 0b1 Trapped exception handling selected. If the floating-point exception occurs, the PE does not update the FPSR.IDC bit. The trap handling software can decide whether to set the FPSR.IDC bit to 1.

D12.2.88 "MVFR1_EL1, AArch32 Media and VFP Feature Register 1" shows that denormal support is in fact completely optional, and offers a bit to detect if there is support:

FPFtZ, bits [3:0]

Flush to Zero mode. Indicates whether the floating-point implementation provides support only for the Flush-to-Zero mode of operation. Defined values are:

  • 0b0000 Not implemented, or hardware supports only the Flush-to-Zero mode of operation.

  • 0b0001 Hardware supports full denormalized number arithmetic.

All other values are reserved.

In ARMv8-A, the permitted values are 0b0000 and 0b0001.

This suggests that when subnormals are not implemented, implementations just revert to flush-to-zero.

Infinity and NaN

Curious? I've written some things at:

How subnormals improve computations

TODO: understand more precisely how that jump makes calculation results worse, and how subnormals improve them.

Actual history

An Interview with the Old Man of Floating-Point by Charles Severance (1998) is a short real-world historical overview in the form of an interview with William Kahan. It was suggested by John Coleman in the comments.

In the IEEE 754 standard, floating point numbers are represented in binary scientific notation, x = M × 2^e. Here M is the mantissa and e is the exponent. Mathematically, you can always choose the exponent so that 1 ≤ M < 2.* However, since in the computer representation the exponent can only have a finite range, there are some numbers which are bigger than zero, but smaller than 1.0 × 2^e_min. Those numbers are the subnormals or denormals.

Practically, the mantissa is stored without the leading 1, since there is always a leading 1, except for subnormal numbers (and zero). Thus the interpretation is that if the exponent is non-minimal, there is an implicit leading 1, and if the exponent is minimal, there isn't, and the number is subnormal.

*) More generally, 1 ≤ M < B for any base-B scientific notation.

From http://blogs.oracle.com/d/entry/subnormal_numbers :

There are potentially multiple ways of representing the same number. Using decimal as an example, the number 0.1 could be represented as 1*10^-1 or 0.1*10^0 or even 0.01 * 10. The standard dictates that the numbers are always stored with the first bit as a one. In decimal that corresponds to the 1*10^-1 example.

Now suppose that the lowest exponent that can be represented is -100. So the smallest number that can be represented in normal form is 1*10^-100. However, if we relax the constraint that the leading bit be a one, then we can actually represent smaller numbers in the same space. Taking a decimal example we could represent 0.1*10^-100. This is called a subnormal number. The purpose of having subnormal numbers is to smooth the gap between the smallest normal number and zero.

It is very important to realise that subnormal numbers are represented with less precision than normal numbers. In fact, they are trading reduced precision for their smaller size. Hence calculations that use subnormal numbers are not going to have the same precision as calculations on normal numbers. So an application which does significant computation on subnormal numbers is probably worth investigating to see if rescaling (i.e. multiplying the numbers by some scaling factor) would yield fewer subnormals, and more accurate results.
