
What does the precision of float, double or long double mean in C++?

For example in this table they say 7 digits to float, and 15 digits to double: https://docs.microsoft.com/en-us/cpp/cpp/data-type-ranges

But this statement returns 9:

std::cout << std::numeric_limits<float>::max_digits10 << '\n';

And this returns 17:

std::cout << std::numeric_limits<double>::max_digits10 << '\n';

Maybe we should subtract the + or - sign and the decimal point from the number of digits? Here they say that this many digits are guaranteed to survive a text -> number -> text conversion: https://en.cppreference.com/w/cpp/types/numeric_limits/max_digits10

But when I do this, the precision is always 7, even in the case of double:

#include <iostream>
#include <string>

int main() {
  double d_example = 0.123456789;
  std::string str_example = std::to_string(d_example);
  d_example = std::stod(str_example);
  std::cout << str_example << '\n';
  std::cout << d_example << '\n';
}

These numbers are characteristic of different operations.

Maximal decimal digits (numeric_limits<>::max_digits10): If a floating-point number is converted to a string with at least that many decimal digits (not counting non-digit characters such as the period) and then back to a number, the resulting number is guaranteed to be equal to the original.

Decimal digits (numeric_limits<>::digits10): If a string with at most that many decimal digits (not counting non-digit characters such as the period) is converted to a floating-point number and then back to a string with the same number of decimal digits, the resulting string is guaranteed to represent the same number as the original string.

The decimal digits value for a single-precision float is actually 6, not 7. It is sometimes informally given as 7 because the test would succeed for most 7-digits strings.

On the max_digits10 page you linked, there's an example of bumping a float value along by the smallest representable increment and displaying it using std::setprecision and << for streaming, with output including:

   max_digits10 is 9 digits
submax_digits10 is 8 digits
 
[...]
 
   max_digits10: 10.0000095
submax_digits10: 10.00001
 
   max_digits10: 10.0000105
submax_digits10: 10.00001

There, you see two float values that - if displayed using max_digits10 precision - are rounded to 10.0000095 and 10.0000105. If you used one less digit to display them (which is what the next line of output shows, just dropping the trailing 0 for brevity), you'd get the same 10.00001[0] text for both float values, and you'd therefore be unable to recreate the original float values by streaming that text back into a float variable. That's the significance of max_digits10: serialisation is reversible if you use that many digits.

Your code - however - uses std::to_string, which formats with a fixed precision of six decimal places (as if by printf with %f), less than max_digits10. to_string is intended to yield a textual representation that's accurate enough for many purposes, but not excessively and annoyingly verbose.

Labeling float or double as having 7 or 15 digits is a rough characterization at best and should not be used as a basis for any sort of numerical analysis of precision.

To evaluate precision and numerical effects, you should always consider the float and double types to be binary numerals with 24 and 53 bits of precision, because that is how they are actually represented. Binary and decimal representations are incommensurate in various ways, so trying to understand or analyze the behavior in decimal makes the binary effects hard to understand.

The numbers you are looking at, std::numeric_limits<Type>::max_digits10, which are 9 and 17 for the typical float and double formats, are not intended to be measures of precision. They are essentially meant to solve this problem:

  • I need to write a floating-point number in decimal to a file and later read the decimal numeral from that file back into a floating-point number. How many decimal digits do I need to write to guarantee that reading it back will restore the original number?

It is not a measure of the accuracy of the floating-point format. It includes some “extra” digits that are caused by the discrepancy between binary and decimal, to allow for the fact that they are “offset” in a certain sense and do not line up. It is as if you have an oddly shaped object you are trying to fit into a rectangular box: the box needs more area than the object because the fit is not perfect. Similarly, max_digits10 specifies more decimal digits than the actual information content of the floating-point type, so it is not a correct measure of the precision of the type.

The parameters that give you information about the precision of the type are std::numeric_limits<Type>::radix and std::numeric_limits<Type>::digits. The first is the numeric base used for floating-point, which is 2 in common C++ implementations. The second is the number of digits the floating-point type has. Those are the actual digits used in the floating-point format; its significand is a numeral formed of base-radix digits. For the common float and double types, radix is 2, and digits is 24 or 53, so they use 24 and 53 base-two digits, respectively.
