What is the difference between using floating-point casts and floating-point suffixes in C and C++?

Question

Is there a difference between this (using floating-point literal suffixes):

float MY_FLOAT = 3.14159265358979323846264338328f; // f suffix
double MY_DOUBLE = 3.14159265358979323846264338328; // no suffix 
long double MY_LONG_DOUBLE = 3.14159265358979323846264338328L; // L suffix

vs this (using floating-point casts):

float MY_FLOAT = (float)3.14159265358979323846264338328;
double MY_DOUBLE = (double)3.14159265358979323846264338328;
long double MY_LONG_DOUBLE = (long double)3.14159265358979323846264338328;

in C and C++?

Note: the same would go for function calls:

void my_func(long double value);

my_func(3.14159265358979323846264338328L);
// vs
my_func((long double)3.14159265358979323846264338328);
// etc.

Answer 1

The default type of an unsuffixed floating-point constant is double. Assuming IEEE 754 floating point, double is a strict superset of float, and thus you will never lose precision by not specifying f. EDIT: this is only true when specifying values that can be represented by float. If rounding occurs this might not be strictly true, because the value is then rounded twice; see Eric Postpischil's answer. So you should also use the f suffix for floats.

This example is also problematic:

long double MY_LONG_DOUBLE = (long double)3.14159265358979323846264338328;

This first gives a double constant, which is then converted to long double. But because you started with a double, you have already lost precision that will never come back. Therefore, if you want full precision in long double constants, you must use the L suffix:

long double MY_LONG_DOUBLE = 3.14159265358979323846264338328L; // L suffix

Answer 2

There is a difference between using a suffix and a cast; 8388608.5000000009f and (float) 8388608.5000000009 have different values in common C implementations. This code:

#include <stdio.h>

int main(void)
{
    float x =         8388608.5000000009f;
    float y = (float) 8388608.5000000009;
    printf("%.9g - %.9g = %.9g.\n", x, y, x-y);
}

prints “8388609 - 8388608 = 1.” in Apple Clang 11.0 and other implementations that use correct rounding with IEEE-754 binary32 for float and binary64 for double. (The C standard permits implementations to use methods other than IEEE-754 correct rounding, so other C implementations may have different results.)

The reason is that (float) 8388608.5000000009 contains two rounding operations. With the suffix, 8388608.5000000009f is converted directly to float, so the portion that must be discarded in order to fit in a float, .5000000009, is directly examined to see whether it is greater than .5 or not. It is, so the result is rounded up to the next representable value, 8388609.

Without the suffix, 8388608.5000000009 is first converted to double. When the portion that must be discarded, .0000000009, is considered, it is found to be less than ½ the low bit at the point of truncation. (The value of the low bit there is .00000000186264514923095703125, and half of it is .000000000931322574615478515625.) So the result is rounded down, and we have 8388608.5 as a double. When the cast rounds this to float, the portion that must be discarded is .5, which is exactly halfway between the representable numbers 8388608 and 8388609. The rule for breaking ties rounds it to the value with the even low bit, 8388608.

(Another example is “7.038531e-26”; (float) 7.038531e-26 is not equal to 7.038531e-26f. This is the only such numeral with fewer than eight significant digits when float is binary32 and double is binary64, except of course “-7.038531e-26”.)

Answer 3

While you do not lose precision if you omit the f in a float constant, there can be surprises in so doing. Consider this:

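#include <stdio.h>

#define DCN 0.1   /* no suffix: type double */
#define FCN 0.1f  /* f suffix: type float */

int main(void)
{
    float f = DCN;  /* double constant rounded to float */
    /* f is converted to double for the comparison with DCN */
    printf("DCN\t%s\n", f > DCN ? "more" : "not-more");
    float g = FCN;  /* float constant stored as is */
    /* both operands are float here */
    printf("FCN\t%s\n", g > FCN ? "more" : "not-more");
    return 0;
}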

This (compiled with gcc 9.1.1) produces the output

DCN more
FCN not-more

The explanation is that in f > DCN the compiler takes DCN to have type double and so promotes f to double. The float nearest to 0.1 is slightly larger than the double nearest to 0.1, so

(double)(float)0.1 > 0.1

Personally, on the (rare) occasions when I need float constants, I always use an 'f' suffix.
