Most accurate line intersection ordinate computation with floats?

I'm computing the ordinate y of a point on a line at a given abscissa x. The line is defined by the coordinates of its two end points, (x0,y0) and (x1,y1). The end point coordinates are floats, and the computation must be done in float precision for use on a GPU.

The maths, and thus the naive implementation, are trivial.

Let t = (x - x0)/(x1 - x0), then y = (1 - t) * y0 + t * y1 = y0 + t * (y1 - y0).
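As a minimal sketch of the two equivalent forms in float precision (the function names below are hypothetical, not from the question): the forms agree mathematically, but they round differently. The two-product form returns the end points exactly at t = 0 and t = 1, while the one-product form rounds y1 - y0 first.

```cpp
#include <cassert>

// Hypothetical helper: interpolation parameter of x on the segment [x0, x1].
inline float paramT(float x, float x0, float x1)
{
    assert(x1 != x0);  // a vertical line has no unique y for a given x
    return (x - x0) / (x1 - x0);
}

// y = (1 - t) * y0 + t * y1: yields y0 at t == 0 and y1 at t == 1 exactly.
inline float lerpTwoProducts(float y0, float y1, float t)
{
    return (1.0f - t) * y0 + t * y1;
}

// y = y0 + t * (y1 - y0): one multiplication fewer, but y1 - y0 is rounded first.
inline float lerpOneProduct(float y0, float y1, float t)
{
    return y0 + t * (y1 - y0);
}
```

For example, with y0 = 1e8f and y1 = 1.0f, the difference y1 - y0 rounds to -1e8f, so the one-product form cannot return y1 at t = 1.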

The problem arises when x1 - x0 is small. The subtraction suffers from cancellation error. Combined with the cancellation error in x - x0, I expect the division to produce a significant error in t.

The question is whether there is another way to determine y with better accuracy.

I.e., should I compute (x - x0)*(y1 - y0) first, and divide by (x1 - x0) afterwards?

The difference y1 - y0 will always be big.

To a large degree, your underlying problem is fundamental. When (x1-x0) is small, it means that only a few bits in the mantissas of x1 and x0 differ. By extension, there is only a limited number of floats between x0 and x1. E.g. if only the lower 4 bits of the mantissa differ, there are at most 14 values between them.

In your best algorithm, the t term represents these lower bits. To continue our example, if x0 and x1 differ by 4 bits, then t can take on only 16 values as well. The calculation of these possible values is fairly robust. Whether you're calculating 3E0/14E0 or 3E-12/14E-12, the result is going to be close to the mathematical value of 3/14.
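To see how few floats lie between two end points, one can count representable steps directly. This is only a sketch: it assumes IEEE 754 binary32 floats, and the helper names orderedBits and ulpDistance are made up for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == 4, "sketch assumes 32-bit IEEE 754 floats");

// Map a float to an integer that changes by 1 per representable value.
inline std::int64_t orderedBits(float v)
{
    std::uint32_t u;
    std::memcpy(&u, &v, sizeof u);
    // Floats are sign-magnitude: mirror the negative ones below zero.
    return (u & 0x80000000u) ? -static_cast<std::int64_t>(u & 0x7FFFFFFFu)
                             :  static_cast<std::int64_t>(u);
}

// Number of representable steps from a to b (0 if they are equal).
inline std::int64_t ulpDistance(float a, float b)
{
    assert(!std::isnan(a) && !std::isnan(b));  // NaN has no place in the ordering
    std::int64_t const d = orderedBits(b) - orderedBits(a);
    return d < 0 ? -d : d;
}
```

A distance of n means there are n - 1 floats strictly between the two values. For instance, ulpDistance(1000.1f, 1000.2f) is 1639, so x has only a little more than a thousand possible positions on that segment.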

Your formula has the additional advantage of having y0 <= y <= y1, since 0 <= t <= 1.

(I'm assuming that you know enough about float representations, and therefore "(x1-x0) is small" really means "small, relative to the values of x1 and x0 themselves". A difference of 1E-1 is small when x0=1E3 but large if x0=1E-6.)

You may have a look at the sources of Qt's "QLine" (if I remember it right). They have implemented an intersection determination algorithm taken from one of the "Graphics Gems" books (the reference must be in the code comments; the book was on EDonkey a couple of years ago). That algorithm, in turn, comes with some guarantees on applicability for a given screen resolution when the calculations are performed with a given bit width (they use fixed-point arithmetic, if I'm not wrong).

If you have the possibility to do it, you can introduce two cases in your computation, depending on whether abs(x1-x0) < abs(y1-y0). In the vertical case, abs(x1-x0) < abs(y1-y0), compute x from y instead of y from x.

EDIT. Another possibility would be to obtain the result bit by bit using a variant of dichotomic search. This will be slower, but may improve the result in extreme cases.

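// Input is x, assumed to lie between x0 and x1 (needs <utility> for std::swap)
// Order the end points by x and keep each y paired with its own x;
// taking min/max of y separately would break lines with a negative slope.
float xmin = x0, ymin = y0; // ymin is the y at xmin, not the smaller y
float xmax = x1, ymax = y1; // ymax is the y at xmax
if (xmin > xmax) { std::swap(xmin, xmax); std::swap(ymin, ymax); }
for (int i = 0; i < 20; i++) // get 20 bits in result
{
  float xmid = (xmin + xmax) * 0.5f;
  float ymid = (ymin + ymax) * 0.5f;
  if (xmid <= xmin || xmid >= xmax) break; // interval can no longer be split in float
  if (x < xmid) { xmax = xmid; ymax = ymid; } // first half
  else { xmin = xmid; ymin = ymid; } // second half
}
// Output is some value between ymin and ymax
float Y = ymin;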
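For experimenting, the same idea can be wrapped in a function. This is a sketch under assumptions: the name getYBisect and the bits parameter are hypothetical, x is assumed to lie on the segment, and the loop stops early once the x interval can no longer be split in float.

```cpp
#include <cassert>
#include <utility>

// Hypothetical wrapper around the bisection idea; x must lie between x0 and x1.
inline float getYBisect(float x, float x0, float y0, float x1, float y1, int bits = 20)
{
    if (x0 > x1) { std::swap(x0, x1); std::swap(y0, y1); }  // keep (x, y) pairs together
    assert(x0 <= x && x <= x1);
    for (int i = 0; i < bits; ++i)
    {
        float const xmid = (x0 + x1) * 0.5f;
        float const ymid = (y0 + y1) * 0.5f;
        if (xmid <= x0 || xmid >= x1) break;     // x0 and x1 are adjacent floats (or equal)
        if (x < xmid) { x1 = xmid; y1 = ymid; }  // x is in the first half
        else          { x0 = xmid; y0 = ymid; }  // x is in the second half
    }
    return (y0 + y1) * 0.5f;  // midpoint of the final y interval
}
```

Returning the midpoint instead of the lower bound is a design choice of this sketch; either value lies inside the final interval.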

I have implemented a benchmark program to compare the effect of the different expressions.

I computed y using double precision and then computed y using single precision with different expressions.

Here are the expressions tested:

inline double getYDbl( double x, double x0, double y0, double x1, double y1 )
{
    double const t = (x - x0)/(x1 - x0);
    return y0 + t*(y1 - y0);
} 

inline float getYFlt1( float x, float x0, float y0, float x1, float y1 )
{
    // t is a float so that the whole expression is evaluated in single precision
    float const t = (x - x0)/(x1 - x0);
    return y0 + t*(y1 - y0);
}

inline float getYFlt2( float x, float x0, float y0, float x1, float y1 )
{
    float const t = (x - x0)*(y1 - y0);
    return y0 + t/(x1 - x0);
}

inline float getYFlt3( float x, float x0, float y0, float x1, float y1 )
{
    float const t = (y1 - y0)/(x1 - x0);
    return y0 + t*(x - x0);
}

inline float getYFlt4( float x, float x0, float y0, float x1, float y1 )
{
    float const t = (x1 - x0)/(y1 - y0);
    return y0 + (x - x0)/t;
}

I computed the average and standard deviation of the difference between the double precision result and the single precision result.
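A harness along these lines could look like the sketch below. It is not the original benchmark program: the names ErrorStats, refY, fltY and measureError, the value ranges, and the seed are all assumptions chosen for illustration, and only one of the float expressions is measured.

```cpp
#include <cassert>
#include <cmath>
#include <random>

struct ErrorStats { double mean; double stdDev; };

// Double precision reference.
inline double refY(double x, double x0, double y0, double x1, double y1)
{
    return y0 + (x - x0) / (x1 - x0) * (y1 - y0);
}

// One single precision candidate; swap in other expressions to compare them.
inline float fltY(float x, float x0, float y0, float x1, float y1)
{
    float const t = (x - x0) / (x1 - x0);
    return y0 + t * (y1 - y0);
}

// Mean and standard deviation of (float result - double result).
inline ErrorStats measureError(int samples, unsigned seed)
{
    assert(samples > 0);
    std::mt19937 rng(seed);
    std::uniform_real_distribution<float> coord(-10.0f, 10.0f), unit(0.0f, 1.0f);
    double sum = 0.0, sumSq = 0.0;
    int used = 0;
    for (int i = 0; i < samples; ++i)
    {
        float const x0 = coord(rng);
        float const y0 = coord(rng);
        float const y1 = coord(rng);
        float const x1 = x0 + 0.5f + unit(rng);        // assumed range: x1 - x0 >= 0.5
        float const x  = x0 + unit(rng) * (x1 - x0);   // x on the segment
        double const diff = static_cast<double>(fltY(x, x0, y0, x1, y1))
                          - refY(x, x0, y0, x1, y1);
        if (std::isnan(diff)) continue;                // filter bogus samples
        sum += diff; sumSq += diff * diff; ++used;
    }
    if (used == 0) return {0.0, 0.0};
    double const mean = sum / used;
    double const variance = sumSq / used - mean * mean;
    return {mean, std::sqrt(variance > 0.0 ? variance : 0.0)};
}
```

Because the same float inputs feed both versions, the measured difference reflects only the float arithmetic, not the rounding of the end points themselves.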

The result is that there is no difference, on average, over sets of 1000 and 10K random values. I used the icc compiler, with and without optimization, as well as g++.

Note that I had to use the isnan() function to filter out bogus values. I suspect these result from underflow in the difference or the division.
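One way a NaN can appear, which may or may not be what happened in the benchmark, is 0/0 when the two end points share the same x and x equals it. The sketch below assumes default IEEE floating-point semantics (no fast-math flags); getYNaive, isBogus and checkNaNExample are hypothetical names.

```cpp
#include <cassert>
#include <cmath>

inline float getYNaive(float x, float x0, float y0, float x1, float y1)
{
    float const t = (x - x0) / (x1 - x0);
    return y0 + t * (y1 - y0);
}

// True when a sample should be left out of the statistics.
// std::isfinite could be used instead to reject infinities as well.
inline bool isBogus(float y)
{
    return std::isnan(y);
}

// Usage sketch: x0 == x1 and x == x0 gives t = 0/0, which is NaN.
inline void checkNaNExample()
{
    assert(isBogus(getYNaive(1.0f, 1.0f, 0.0f, 1.0f, 2.0f)));
}
```

Note that with x0 == x1 but x != x0, t is an infinity rather than a NaN, so isnan() alone does not catch that case.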

I don't know whether the compilers rearrange the expressions.

Anyway, the conclusion from this test is that the above rearrangements of the expression have no effect on the computation precision. The error remains the same (on average).

If your source data is already a float, then you already have a fundamental inaccuracy.

To explain further, imagine you were doing this graphically. You have a 2D sheet of graph paper, with 2 points marked.

Case 1: Those points are very accurate, and have been marked with a very sharp pencil. It's easy to draw the line joining them, and then easy to get y given x (or vice versa).

Case 2: These points have been marked with a big fat felt tip pen, like a bingo marker. Clearly the line you draw will be less accurate. Do you go through the centre of the spots? The top edge? The bottom edge? Top of one, bottom of the other? Clearly there are many different options. If the two dots are close to each other, then the variation will be even greater.

Floats have a certain level of inaccuracy inherent in them, due to the way they represent numbers, so they correspond more to case 2 than to case 1 (which one could suggest is the equivalent of using an arbitrary precision library). No algorithm in the world can compensate for that. Imprecise data in, imprecise data out.

Check whether the distance between x0 and x1 is small, i.e. fabs(x1 - x0) < eps. If it is, the line is parallel to the y axis of the coordinate system, i.e. you can't calculate the y values of that line as a function of x. You have infinitely many y values, and therefore you have to treat this case differently.
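Such a guard might be sketched as follows. The function name tryGetY and the choice to pass eps as a parameter are assumptions; since "small" is relative to the magnitudes of x0 and x1, a caller would probably want to scale eps accordingly rather than use one fixed constant.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical guard: returns false (and leaves y untouched) when the
// segment is treated as vertical, true otherwise.
inline bool tryGetY(float x, float x0, float y0, float x1, float y1,
                    float eps, float& y)
{
    assert(eps > 0.0f);
    if (std::fabs(x1 - x0) < eps)
        return false;  // no unique y for this x; the caller handles this case
    float const t = (x - x0) / (x1 - x0);
    y = y0 + t * (y1 - y0);
    return true;
}
```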

How about computing something like:

t = sign * power2 ( sqrt (abs(x - x0))/ sqrt (abs(x1 - x0)))

The idea is to use a mathematically equivalent formula in which a small (x1-x0) has less effect. (I'm not sure whether the one I wrote meets this criterion.)
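Written out in C++, the formula could look like this sketch, where paramTViaSqrt is a hypothetical name and sign is taken to be the sign of (x - x0)/(x1 - x0). Note that the subtractions x - x0 and x1 - x0 are still performed first, so any rounding they introduce is carried into the square roots.

```cpp
#include <cassert>
#include <cmath>

// t = sign * (sqrt(|x - x0|) / sqrt(|x1 - x0|))^2
inline float paramTViaSqrt(float x, float x0, float x1)
{
    assert(x1 != x0);
    float const num = x - x0;
    float const den = x1 - x0;
    float const sign = ((num < 0.0f) != (den < 0.0f)) ? -1.0f : 1.0f;
    float const r = std::sqrt(std::fabs(num)) / std::sqrt(std::fabs(den));
    return sign * r * r;  // power2(r)
}
```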

As MSalters said, the problem is already in the original data.

Interpolation / extrapolation requires the slope, which already has low accuracy under the given conditions (worst for very short line segments far away from the origin).
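The effect can be made visible by comparing the float slope with a double precision slope for the same segment placed near and far from the origin. This is a sketch with a hypothetical function name, slopeRelError, and it assumes the true end points are available in double.

```cpp
#include <cassert>
#include <cmath>

// Relative error of the slope computed from float-rounded end points,
// measured against the slope computed in double. Requires y1 != y0.
inline double slopeRelError(double x0, double y0, double x1, double y1)
{
    assert(x1 != x0 && y1 != y0);
    float const fx0 = static_cast<float>(x0), fy0 = static_cast<float>(y0);
    float const fx1 = static_cast<float>(x1), fy1 = static_cast<float>(y1);
    float const slopeF = (fy1 - fy0) / (fx1 - fx0);
    double const slopeD = (y1 - y0) / (x1 - x0);
    return std::fabs((static_cast<double>(slopeF) - slopeD) / slopeD);
}
```

For a segment of length 0.1 in x, slopeRelError(0.1, 0.0, 0.2, 1.0) stays below 1e-6, while slopeRelError(1000.1, 0.0, 1000.2, 1.0) is several hundred times larger, because rounding 1000.1 and 1000.2 to float already perturbs their difference.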

The choice of algorithm cannot regain this accuracy loss. My gut feeling is that a different evaluation order will not change things, as the error is introduced by the subtractions, not the division.

Idea:
If you have more accurate data when the lines are generated, you can change the representation from ((x0, y0), (x1, y1)) to (x0, y0, angle, length). You could store the angle or the slope; the slope has a pole, but the angle requires trig functions... ugly.
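A slope-based variant of this representation could be sketched as below. It assumes the end points are available in double at generation time; SlopeLine, makeSlopeLine and getY are hypothetical names, and the length field is left out for brevity.

```cpp
#include <cassert>

// Hypothetical compact representation: anchor point plus slope, all floats.
struct SlopeLine { float x0, y0, slope; };

// Build the representation while the accurate (double) data is still at hand,
// so the slope is rounded to float only once.
inline SlopeLine makeSlopeLine(double x0, double y0, double x1, double y1)
{
    assert(x1 != x0);  // the slope has a pole for vertical lines
    return { static_cast<float>(x0), static_cast<float>(y0),
             static_cast<float>((y1 - y0) / (x1 - x0)) };
}

// Evaluate y at x in float precision.
inline float getY(SlopeLine const& line, float x)
{
    return line.y0 + line.slope * (x - line.x0);
}
```

This keeps three floats per line instead of four, at the cost of having to recompute the second end point when it is needed.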

Of course, that won't work if you need the end point frequently, or if you have so many lines that you can't store additional data; I can't judge that. But maybe there is another representation that works well for your needs.

Doubles have enough resolution in most situations, but that would double the working set too.
