I need to calculate bounds for the RectangleF type. For some reason, the cast from double to float is not evaluated as precisely as it should be.
Here is an example of such a calculation:
float MinX = 0f, MaxX = 0f;
float MinY = 0f, MaxY = 0f;
float BoundsWidth = 0.2f;
float BoundsHeight = 0.1f;
double BoundsY = 2333638.6551984739;
double BoundsX = 895.0999755859375;
MinX = (float)BoundsX;
MinY = (float)BoundsY;
var MaxX_Defect = BoundsX + BoundsWidth;
var MaxY_Defect = BoundsY + BoundsHeight;
MaxX = (float)(MaxX_Defect);
MaxY = (float)(MaxY_Defect);
When I try to calculate the height, MaxY - MinY, it is evaluated as 0 instead of 0.1f.
How can I fix this?
The line float BoundsHeight = 0.1f; converts 0.1 to the nearest value representable in float, resulting in BoundsHeight being 0.100000001490116119384765625.
The line double BoundsY = 2333638.6551984739; similarly converts to double, setting BoundsY to 2,333,638.6551984739489853382110595703125.
The line MinY = (float)BoundsY; converts that to float, setting MinY to 2,333,638.75.
The line var MaxY_Defect = BoundsY + BoundsHeight; computes using double (I presume; I am not familiar with C# semantics), setting MaxY_Defect to 2,333,638.7551984754391014575958251953125.
The line MaxY = (float)(MaxY_Defect); converts that to float, setting MaxY to 2,333,638.75.
Then we can see that MinY and MaxY have the same value, so of course MaxY - MinY is zero.
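The walk-through above can be reproduced with a minimal sketch like the following. The class name FloatRoundingDemo and its helper methods are made up for illustration; the values are the ones from the question. It assumes that C# promotes the float operand to double in BoundsY + BoundsHeight, which is the usual implicit numeric conversion.

```csharp
using System;

static class FloatRoundingDemo
{
    // Values taken from the question.
    const double BoundsY = 2333638.6551984739;
    const float BoundsHeight = 0.1f;

    // Narrowing the double to float rounds to the nearest representable float.
    public static float MinY() => (float)BoundsY;

    // The sum is computed in double (BoundsHeight is promoted), then narrowed.
    public static float MaxY() => (float)(BoundsY + BoundsHeight);

    static void Main()
    {
        float minY = MinY();
        float maxY = MaxY();

        Console.WriteLine(minY == 2333638.75f); // True
        Console.WriteLine(maxY == 2333638.75f); // True
        Console.WriteLine(maxY - minY);         // 0
    }
}
```

Both casts land on the same float, so the subtraction has nothing left to distinguish.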
Quite simply, float does not have enough precision to distinguish between 2,333,638.6551984739489853382110595703125 and 2,333,638.7551984754391014575958251953125. At the scale of 2,333,638, the distance between adjacent representable numbers in the float format is 0.25. This is because the format has 24 bits for the significand (the fraction portion of the floating-point representation). 2,333,638 is between 2^21 and 2^22, so the exponent in its floating-point representation scales the significand to have bits representing values from 2^21 to 2^−2 (from 21 to −2, inclusive, is 24 positions). So changing the significand by 1 in its lowest bit changes the represented number by 2^−2 = 0.25.
Thus, when 2,333,638.655… and 2,333,638.755… are converted to float, they have the same result, 2,333,638.75.
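To check the spacing directly, one option is to ask the runtime for the next representable float and subtract. This is only a sketch: FloatSpacingDemo and SpacingAbove are hypothetical names, and it assumes a runtime that provides MathF.BitIncrement (.NET Core 3.0 or later).

```csharp
using System;

static class FloatSpacingDemo
{
    // Distance from x to the next larger representable float.
    public static float SpacingAbove(float x) => MathF.BitIncrement(x) - x;

    static void Main()
    {
        // At the magnitude of BoundsY, adjacent floats are 2^-2 apart.
        Console.WriteLine(SpacingAbove(2333638.75f)); // 0.25

        // At the magnitude of BoundsX (between 2^9 and 2^10), the spacing
        // is 2^-14, far smaller than the 0.2 width.
        Console.WriteLine(SpacingAbove(895.1f) < 0.2f); // True
    }
}
```

This also suggests why the X axis in the question behaves as expected while the Y axis does not: the width of 0.2 is much larger than the float spacing near 895, but the height of 0.1 is smaller than the spacing near 2,333,638.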
You cannot use float to distinguish between coordinates or sizes that are this close at that magnitude. You can use double, or you might be able to translate the coordinates to be nearer the origin (so their magnitudes are smaller, putting them in a region where the float resolution is finer).
As long as the final result is small, you could do the intermediate calculations using double but still represent the final result well using float.
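A minimal sketch of both suggestions, using the values from the question. The names BoundsFix, Extent, and ToLocal are hypothetical, and using BoundsY itself as the local origin is just an assumption for the example; any reference point near the data would serve.

```csharp
using System;

static class BoundsFix
{
    // Subtract in double and narrow only the (small) result to float.
    public static float Extent(double min, double max) => (float)(max - min);

    // Translate a coordinate toward a chosen origin before narrowing,
    // so the float only has to represent a small magnitude.
    public static float ToLocal(double value, double origin) => (float)(value - origin);

    static void Main()
    {
        double boundsY = 2333638.6551984739;
        float boundsHeight = 0.1f;
        double maxY = boundsY + boundsHeight; // kept in double

        // Height computed in double, then stored as float: about 0.1, not 0.
        Console.WriteLine(Extent(boundsY, maxY));

        // Same idea with translated coordinates (origin chosen as boundsY).
        float localMinY = ToLocal(boundsY, boundsY);
        float localMaxY = ToLocal(maxY, boundsY);
        Console.WriteLine(localMaxY - localMinY);
    }
}
```

If the bounds must end up in a RectangleF, the translated values are the ones to store, with the origin kept separately in double.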