How to optimize interpolation of a large set of scattered points?
I am currently working with a set of coordinate points (longitude, latitude; about 60000 of them), each with the temperature at that location. I need to interpolate them to compute values at points where the temperature is unknown, in order to map certain regions. To respect the influence the points have on one another, I have converted every (long, lat) point to a point (x, y, z) on the unit sphere. I have started applying the generalized multidimensional Shepard interpolation from "Numerical Recipes, 3rd Edition":
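Doub interp(VecDoub_I &pt)
{
    Doub r, w, sum=0., sumw=0.;
    if (pt.size() != dim)
        throw("RBF_interp bad pt size");
    for (Int i=0;i<n;i++)
    {
        // a query that coincides with a sample returns that sample's value
        if ((r=rad(&pt[0],&pts[i][0])) == 0.)
            return vals[i];
        sum += (w = pow(r,pneg));   // w = r^pneg; sum accumulates the weights
        sumw += w*vals[i];          // sumw accumulates the weighted values
    }
    return sumw/sum;
}

// Euclidean distance between two dim-dimensional points
Doub rad(const Doub *p1, const Doub *p2)
{
    Doub sum = 0.;
    for (Int i=0;i<dim;i++)
        sum += SQR(p1[i]-p2[i]);
    return sqrt(sum);
}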
As you can see, to interpolate one point the algorithm computes the distance from that point to every sample point and uses it as a weight in the final value. Although this algorithm works, it is much too slow for my needs, since I will be computing a lot of points to map a grid over a region.
One way of optimizing this would be to leave out the points that are beyond a certain radius, but that would pose a problem for areas with few or no points. Another would be to avoid recomputing the distance between each pair of points by computing a look-up table once and storing the distances. The problem with this is that it is impossible to store such a large matrix (60000 x 60000).
The resulting grid of temperatures will be used to compute contours for different temperature values.
If anyone knows a way to optimize this algorithm, or can suggest a better one, I would appreciate it.
Radial basis functions with infinite support are probably not what you want if you have a large number of data points and will be computing a large number of interpolated values.
There are variants that use the N nearest neighbours and finite support to reduce the number of points that must be considered for each interpolated value. A variant of this can be found in the first solution mentioned in "Inverse Distance Weighted (IDW) Interpolation with Python". (I have a nagging suspicion that this implementation can be discontinuous under certain conditions, though there are certainly variants that are fine.)
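The N-nearest-neighbour idea can be sketched in C++ roughly as follows, assuming the samples are already unit-sphere (x, y, z) points as in the question. The names `KnnShepard`, `sqDist`, `k` and `p` are hypothetical and come from neither Numerical Recipes nor the linked Python answer. The sketch builds a k-d tree once, then weights only the `k` closest samples by r^(-p), so a query should visit far fewer than all 60000 points. Like any hard nearest-neighbour cutoff, the result can jump where the neighbour set changes.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <queue>
#include <utility>
#include <vector>

using Point = std::array<double, 3>;

// Hypothetical helper: squared Euclidean (chord) distance between two points.
inline double sqDist(const Point& a, const Point& b) {
    double s = 0.0;
    for (std::size_t d = 0; d < 3; ++d) s += (a[d] - b[d]) * (a[d] - b[d]);
    return s;
}

// Hypothetical class: Shepard interpolation restricted to the k nearest samples.
class KnnShepard {
    // Max-heap of (squared distance, sample index); top() is the worst kept candidate.
    using Heap = std::priority_queue<std::pair<double, std::size_t>>;

public:
    KnnShepard(std::vector<Point> pts, std::vector<double> vals)
        : pts_(std::move(pts)), vals_(std::move(vals)), idx_(pts_.size()) {
        assert(!pts_.empty() && pts_.size() == vals_.size());
        std::iota(idx_.begin(), idx_.end(), std::size_t{0});
        build(0, idx_.size(), 0);
    }

    // Estimate at q from the k nearest samples, each weighted by r^(-p).
    double interp(const Point& q, std::size_t k, double p = 2.0) const {
        assert(k > 0);
        Heap heap;
        search(0, idx_.size(), 0, q, std::min(k, pts_.size()), heap);
        double sum = 0.0, sumw = 0.0;
        while (!heap.empty()) {
            const double d2 = heap.top().first;
            const std::size_t i = heap.top().second;
            heap.pop();
            if (d2 == 0.0) return vals_[i];           // query coincides with a sample
            const double w = std::pow(d2, -0.5 * p);  // r^(-p) with r = sqrt(d2)
            sum += w;
            sumw += w * vals_[i];
        }
        return sumw / sum;
    }

private:
    // Implicit k-d tree: the median of each range [lo, hi) is moved to its midpoint.
    void build(std::size_t lo, std::size_t hi, std::size_t depth) {
        if (hi - lo <= 1) return;
        const std::size_t mid = lo + (hi - lo) / 2, ax = depth % 3;
        std::nth_element(idx_.begin() + lo, idx_.begin() + mid, idx_.begin() + hi,
                         [this, ax](std::size_t a, std::size_t b) {
                             return pts_[a][ax] < pts_[b][ax];
                         });
        build(lo, mid, depth + 1);
        build(mid + 1, hi, depth + 1);
    }

    void search(std::size_t lo, std::size_t hi, std::size_t depth, const Point& q,
                std::size_t k, Heap& heap) const {
        if (lo >= hi) return;
        const std::size_t mid = lo + (hi - lo) / 2, ax = depth % 3, i = idx_[mid];
        const double d2 = sqDist(q, pts_[i]);
        if (heap.size() < k) {
            heap.emplace(d2, i);
        } else if (d2 < heap.top().first) {
            heap.pop();
            heap.emplace(d2, i);
        }
        const double diff = q[ax] - pts_[i][ax];
        const bool left = diff < 0.0;
        // Descend first into the half that contains q.
        search(left ? lo : mid + 1, left ? mid : hi, depth + 1, q, k, heap);
        // Visit the other half only if the splitting plane is closer than the worst kept neighbour.
        if (heap.size() < k || diff * diff < heap.top().first)
            search(left ? mid + 1 : lo, left ? hi : mid, depth + 1, q, k, heap);
    }

    std::vector<Point> pts_;
    std::vector<double> vals_;
    std::vector<std::size_t> idx_;
};
```

With this sketch, `KnnShepard s(pts, vals);` is constructed once and `s.interp(q, 8)` is then called for each grid point; the value 8 is only an illustrative choice of neighbour count. Because the k nearest samples are always used, sparse regions still get an estimate, which addresses the concern about a fixed radius cutoff.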
Your look-up table doesn't have to store every point in the 60k square, only the ones that are used repeatedly. You can map any coordinate x to int(x*resolution) to improve the hit rate by lowering the resolution.
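One way to read this suggestion is as a cache keyed on quantized coordinates. The C++ sketch below is an assumption about what such a table could look like; `QuantizedCache`, `snap` and the wrapped function type are hypothetical names. It rounds to the nearest grid cell with `std::lround` rather than truncating as `int(x*resolution)` would, so that cells around zero keep the same width as the others.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <map>
#include <tuple>
#include <utility>

// Hypothetical wrapper: memoizes an expensive function of (x, y, z) on a coordinate grid.
class QuantizedCache {
    using Key = std::tuple<long, long, long>;

public:
    using Fn = std::function<double(double, double, double)>;

    // resolution = number of grid cells per unit of coordinate; lower values mean more hits.
    QuantizedCache(double resolution, Fn f) : res_(resolution), f_(std::move(f)) {
        assert(res_ > 0.0);
    }

    double operator()(double x, double y, double z) {
        const Key key{snap(x), snap(y), snap(z)};
        const auto it = table_.find(key);
        if (it != table_.end()) return it->second;  // cache hit: no recomputation
        // Evaluate at the snapped coordinates so the stored value does not
        // depend on which query reached this cell first.
        const double v = f_(std::get<0>(key) / res_,
                            std::get<1>(key) / res_,
                            std::get<2>(key) / res_);
        table_.emplace(key, v);
        return v;
    }

    // Number of distinct cells evaluated so far.
    std::size_t size() const { return table_.size(); }

private:
    long snap(double x) const { return std::lround(x * res_); }

    double res_;
    Fn f_;
    std::map<Key, double> table_;
};
```

Only cells that are actually queried take up memory, so the table stays far smaller than a full 60000 x 60000 matrix. The trade-off is accuracy: every query inside a cell receives the value computed at that cell's snapped coordinates.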
A similar lookup table for the power function might also help.
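A table for the weight function might look like the C++ sketch below, which is only one possible reading of the suggestion. `PowTable` and its parameters are hypothetical. It tabulates r^(-p) against the squared distance, which also avoids the `sqrt` call, and returns the value stored for the bin that the query falls in. The table is coarse near zero, where the weight grows without bound, so very close samples may still need an exact evaluation.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical table of w(r2) = r2^(-p/2), i.e. r^(-p) expressed in squared distance r2.
class PowTable {
public:
    // Covers r2 in [0, maxR2] with `bins` equal-width bins, tabulated at bin centres.
    PowTable(double p, double maxR2, std::size_t bins)
        : scale_(static_cast<double>(bins) / maxR2), table_(bins) {
        assert(bins > 0 && maxR2 > 0.0);
        for (std::size_t i = 0; i < bins; ++i)
            table_[i] = std::pow((static_cast<double>(i) + 0.5) / scale_, -0.5 * p);
    }

    // Approximate weight for a squared distance; values past maxR2 use the last bin.
    double weight(double r2) const {
        assert(r2 >= 0.0);
        const std::size_t i = static_cast<std::size_t>(r2 * scale_);
        return table_[std::min(i, table_.size() - 1)];
    }

private:
    double scale_;               // bins per unit of squared distance
    std::vector<double> table_;  // one precomputed weight per bin
};
```

For points on the unit sphere the chord distance never exceeds 2, so `maxR2 = 4.0` covers every pair. When the exponent is exactly 2, the weight is simply `1.0 / r2`, and neither `pow` nor a table is needed.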