Calculate the average function of several functions

Question

I have several ordered lists of X/Y pairs, and I want to calculate an ordered list of X/Y pairs that represents the average of these lists.

All these lists (including the "average list") will then be drawn onto a chart (see the example picture below).

I have several problems:

The different lists don't have the same number of values.
The X and Y values can increase, decrease, and increase again (and so on) (see the example picture below).

I need to implement this in C#, although I guess that's not really important for the algorithm itself.

[Image: example of lines]

Sorry that I can't explain my problem in a more formal or mathematical way.

EDIT: I replaced the term "function" with "List of X/Y Pairs", which is less confusing.

Answer 1

I'll use a metaphor of your functions being cars racing down a curvy racetrack, where you want to extract the center-line of the track given the cars' positions. Each car's position can be described as a function of time:

p1(t) = (x1(t), y1(t))
p2(t) = (x2(t), y2(t))
p3(t) = (x3(t), y3(t))

The crucial problem is that the cars are racing at different speeds, which means that p1(10) could be twice as far down the race track as p2(10). If you took a naive average of these two points, and there was a sharp curve in the track between the cars, the average could be far from the track.

If you could just transform your functions so that they are no longer functions of time, but functions of the distance along the track, then you would be able to do what you want.

One way you could do this would be to choose the slowest car (i.e., the one with the greatest number of samples). Then, for each sample of the slowest car's position, look at all of the other cars' paths, find the two closest points, and choose the point on the interpolated line which is closest to the slowest car's position. Then average these points together. Once you do this for all of the slow car's samples, you have an average path.
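Choosing "the point on the interpolated line which is closest to the slowest car's position" amounts to projecting that position onto the line segment between the two neighbouring samples. Here is a minimal sketch of that one step, assuming points are stored as Tuple<double, double>. ClosestPointOnSegment is a hypothetical helper name and does not come from the answer.

```csharp
using System;

// Hypothetical helper (not part of the original answer).
// Returns the point on the segment from p to q that is closest to target.
static Tuple<double, double> ClosestPointOnSegment(
    Tuple<double, double> p, Tuple<double, double> q, Tuple<double, double> target)
{
    double dx = q.Item1 - p.Item1;
    double dy = q.Item2 - p.Item2;
    double lengthSquared = dx * dx + dy * dy;
    if (lengthSquared == 0.0) return p; // p and q coincide, so the segment is a single point

    // Project target onto the line through p and q; t = 0 is p and t = 1 is q.
    double t = ((target.Item1 - p.Item1) * dx + (target.Item2 - p.Item2) * dy) / lengthSquared;
    t = Math.Max(0.0, Math.Min(1.0, t)); // clamp so the result stays on the segment

    return Tuple.Create(p.Item1 + t * dx, p.Item2 + t * dy);
}
```

For the segment from (0, 0) to (2, 0) and the target (1, 1), this returns (1, 0). A target beyond the end of the segment, such as (5, 3), is clamped to the endpoint (2, 0).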

I'm assuming that all of the cars start and end in roughly the same places; if any of the cars just race a small portion of the track, you will need to add some more logic to detect that.

A possible improvement (for both performance and accuracy) is to keep track of the most recent sample you are using for each car and the speed of each car (the relative sampling rate). For your slowest car, it would be a simple map: 1 => 1, 2 => 2, 3 => 3, ... For the other cars, though, it could be more like: 1 => 0.3, 2 => 0.7, 3 => 1.6 (fractional values are due to interpolation). The speed would be the inverse of the change in sample number (e.g., the slow car would have speed 1, and the other car would have speed 1/(1.6-0.7)=1.11). You could then ensure that you don't accidentally backtrack on any of the cars. You could also improve the calculation speed because you don't have to search through the whole set of all points on each path; instead, you can assume that the next sample will be somewhere close to the current sample plus 1/speed.
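Separately from the nearest-point search, the idea of making each path a function of the distance along the track can also be sketched directly: measure the length travelled along each path, pick the point at the same fraction of the total length on every path, and average those points. This is only a sketch under assumptions that are not in the answer. It relies on all paths covering roughly the same stretch of track (as assumed above) and on every path being non-empty, and the names PointAtFraction and AverageByDistance and the samples parameter are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: returns the point at the given fraction (0..1) of the path's total length.
static Tuple<double, double> PointAtFraction(List<Tuple<double, double>> path, double fraction)
{
    // cumulative[k] is the distance travelled along the path up to path[k]
    var cumulative = new double[path.Count];
    for (int k = 1; k < path.Count; k++)
    {
        double dx = path[k].Item1 - path[k - 1].Item1;
        double dy = path[k].Item2 - path[k - 1].Item2;
        cumulative[k] = cumulative[k - 1] + Math.Sqrt(dx * dx + dy * dy);
    }

    double target = fraction * cumulative[path.Count - 1];
    for (int k = 1; k < path.Count; k++)
    {
        if (target <= cumulative[k])
        {
            // Interpolate linearly inside the segment that contains the target distance.
            double segment = cumulative[k] - cumulative[k - 1];
            double t = segment == 0.0 ? 0.0 : (target - cumulative[k - 1]) / segment;
            return Tuple.Create(
                path[k - 1].Item1 + t * (path[k].Item1 - path[k - 1].Item1),
                path[k - 1].Item2 + t * (path[k].Item2 - path[k - 1].Item2));
        }
    }
    return path[path.Count - 1]; // single-point path, or rounding pushed target past the end
}

// Hypothetical helper: averages the paths at 'samples' evenly spaced fractions of their length.
static List<Tuple<double, double>> AverageByDistance(List<List<Tuple<double, double>>> paths, int samples)
{
    var result = new List<Tuple<double, double>>();
    for (int s = 0; s < samples; s++)
    {
        double fraction = samples == 1 ? 0.0 : (double)s / (samples - 1);
        var points = paths.Select(p => PointAtFraction(p, fraction)).ToList();
        result.Add(Tuple.Create(points.Average(p => p.Item1), points.Average(p => p.Item2)));
    }
    return result;
}
```

The sketch recomputes the cumulative lengths on every call to keep it short; for long paths it would make sense to compute them once per path. Note that normalising by total length treats every car as covering the whole track, so it will misalign paths that only cover part of it.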

Answer 2

I would use the method Justin proposes, with one adjustment. He suggests using a mapping table with fractional indices, though I would suggest integer indices. This might sound a little mathematical, but it's no shame to have to read the following twice (I'd have to too). Suppose you have searched another list B for the point closest to the point at index i in a list of pairs A, and that closest point is at index j. To find the closest point in B to A[i+1], you should only consider points in B with an index equal to or larger than j. It will probably be j + 1, but could be j or j + 2, j + 3, etc., but never below j. Even if the point closest to A[i+1] has an index smaller than j, you still shouldn't use that point to interpolate with, since that would result in an unexpected average and graph. I'll take a moment now to create some sample code for you. I hope you see that this optimization makes sense.

EDIT: While implementing this, I realised that j is not only bounded from below (by the method described above), but also bounded from above. When you try the distance from A[i+1] to B[j], B[j+1], B[j+2], etc., you can stop comparing when the distance from A[i+1] to B[j+...] stops decreasing. There's no point in searching further in B. The same reasoning applies as when j was bounded from below: even if some point elsewhere in B would be closer, that's probably not the point you want to interpolate with. Doing so would result in an unexpected graph, probably less smooth than you'd expect. An extra bonus of this second bound is the improved performance. I've created the following code:

using System;
using System.Collections.Generic;
using System.Linq;

IEnumerable<Tuple<double, double>> Average(List<Tuple<double, double>> A, List<Tuple<double, double>> B)
{
    if (A == null || B == null || A.Any(p => p == null) || B.Any(p => p == null)) throw new ArgumentException();
    Func<double, double> square = d => d * d; // squares its argument
    // computes the distance from A[first argument] to B[second argument]
    Func<int, int, double> euclidianDistance = (a, b) => Math.Sqrt(square(A[a].Item1 - B[b].Item1) + square(A[a].Item2 - B[b].Item2));

    int previousIndexInB = 0;
    for (int i = 0; i < A.Count; i++)
    {
        double distance = euclidianDistance(i, previousIndexInB); // distance from A[i] to the point in B that was closest to A[i - 1] (B[0] initially)
        for (int j = previousIndexInB + 1; j < B.Count; j++)
        {
            var distance2 = euclidianDistance(i, j);//distance between A[i] and B[j]
            if (distance2 < distance)//if it's closer than the previously checked point, keep searching. Otherwise stop the search and return an interpolated point.
            {
                distance = distance2;
                previousIndexInB = j;
            }
            else
            {
                break;//don't place the yield return statement here, because that could go wrong at the end of B.
            }
        }
        yield return LinearInterpolation(A[i], B[previousIndexInB]);
    }
}
// Despite its name, this returns the midpoint of a and b, i.e. their average with equal weights.
Tuple<double, double> LinearInterpolation(Tuple<double, double> a, Tuple<double, double> b)
{
    return new Tuple<double, double>((a.Item1 + b.Item1) / 2, (a.Item2 + b.Item2) / 2);
}

For your information, the function Average returns as many interpolated points as the list A contains, which is probably fine, but you should think about this for your specific application. I've added some comments to clarify some details, and I've described all aspects of this code in the text above. I hope it's clear; otherwise, feel free to ask questions.

SECOND EDIT: I misread and thought you had only two lists of points. I have created a generalised version of the function above that accepts multiple lists. It still uses only the principles explained above.

IEnumerable<Tuple<double, double>> Average(List<List<Tuple<double, double>>> data)
{
    if (data == null || data.Count < 2 || data.Any(list => list == null || list.Any(p => p == null))) throw new ArgumentException();
    Func<double, double> square = d => d * d;
    Func<Tuple<double, double>, Tuple<double, double>, double> euclidianDistance = (a, b) => Math.Sqrt(square(a.Item1 - b.Item1) + square(a.Item2 - b.Item2));

    var firstList = data[0];
    // For each list except the first: the index of the point closest to the previous point firstList[i - 1]
    // (zero before the first iteration). Declared outside the loop so that the lower bound on j
    // carries over from one point of firstList to the next, as described in the text above.
    int[] previousIndices = new int[data.Count];
    for (int i = 0; i < firstList.Count; i++)
    {
        var closests = new Tuple<double, double>[data.Count];//an array of points used for caching, of which the average will be yielded.
        closests[0] = firstList[i];
        for (int listIndex = 1; listIndex < data.Count; listIndex++)
        {
            var list = data[listIndex];
            double distance = euclidianDistance(firstList[i], list[previousIndices[listIndex]]);
            for (int j = previousIndices[listIndex] + 1; j < list.Count; j++)
            {
                var distance2 = euclidianDistance(firstList[i], list[j]);
                if (distance2 < distance)//if it's closer than the previously checked point, keep searching. Otherwise stop the search and return an interpolated point.
                {
                    distance = distance2;
                    previousIndices[listIndex] = j;
                }
                else
                {
                    break;
                }
            }
            closests[listIndex] = list[previousIndices[listIndex]];
        }
        yield return new Tuple<double, double>(closests.Select(p => p.Item1).Average(), closests.Select(p => p.Item2).Average());
    }
}

Actually, handling the specific case of two lists separately might have been a good thing: it is easily explained and offers a step before understanding the generalised version. Furthermore, the square root could be taken out, since it doesn't change the order of the distances when sorted, just the lengths.
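The remark about the square root holds because the square root is increasing for non-negative values, so comparing squared distances gives the same result as comparing the distances themselves. A minimal sketch of a distance function without the root follows. SquaredDistance is a hypothetical name, and swapping it in is only valid where distances are compared with each other and never used as actual lengths.

```csharp
using System;

// Hypothetical replacement for the distance lambda: squared Euclidean distance, no Math.Sqrt call.
// Comparing two squared distances orders them exactly as comparing the real distances would.
static double SquaredDistance(Tuple<double, double> a, Tuple<double, double> b)
{
    double dx = a.Item1 - b.Item1;
    double dy = a.Item2 - b.Item2;
    return dx * dx + dy * dy;
}
```

For example, the squared distance from (0, 0) to (3, 4) is 25, where the true distance is 5.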

THIRD EDIT: In the comments it became clear there might be a bug. I think there are none, aside from the mentioned small bug, which shouldn't make any difference except at the end of the graphs. As proof that it actually works, this is the result (the dotted line is the average):

[Image: chart of the result]

Answer 3

As these are not y=f(x) functions, are they perhaps something like (x,y)=f(t)?

If so, you could interpolate along t, and calculate avg(x) and avg(y) for each t.

EDIT: This of course assumes that t can be made available to your code, so that you have an ordered list of T/X/Y triples.
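If t is available, interpolating along t and averaging could look roughly like the sketch below. It assumes each list holds (T, X, Y) triples as Tuple<double, double, double>, sorted by ascending T and non-empty, and that times outside a list's sampled range should be clamped to its first or last sample. PositionAt and AverageAt are hypothetical names, and the clamping is a choice made for this sketch only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: linearly interpolates the (x, y) position at time t
// from samples (T, X, Y) sorted by ascending T. Times outside the range are clamped.
static Tuple<double, double> PositionAt(List<Tuple<double, double, double>> samples, double t)
{
    if (t <= samples[0].Item1) return Tuple.Create(samples[0].Item2, samples[0].Item3);
    for (int k = 1; k < samples.Count; k++)
    {
        var a = samples[k - 1];
        var b = samples[k];
        if (t <= b.Item1)
        {
            // Here a.T < t <= b.T, so the denominator is strictly positive.
            double w = (t - a.Item1) / (b.Item1 - a.Item1); // 0 at a, 1 at b
            return Tuple.Create(a.Item2 + w * (b.Item2 - a.Item2), a.Item3 + w * (b.Item3 - a.Item3));
        }
    }
    var last = samples[samples.Count - 1];
    return Tuple.Create(last.Item2, last.Item3);
}

// Hypothetical helper: avg(x) and avg(y) over all lists at a single time t.
static Tuple<double, double> AverageAt(List<List<Tuple<double, double, double>>> lists, double t)
{
    var points = lists.Select(list => PositionAt(list, t)).ToList();
    return Tuple.Create(points.Average(p => p.Item1), points.Average(p => p.Item2));
}
```

Calling AverageAt for a series of t values (for instance, every T that occurs in any of the lists) yields the ordered list of averaged points.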

Answer 4

There are several ways this can be done. One is to combine all of your data into one single set of points, and fit a best-fit curve through the combined set.

Answer 5

You have, e.g., two "functions":

fc1 = { {1,0.3} {2, 0.5} {3, 0.1} }
fc2 = { {1,0.1} {2, 0.8} {3, 0.4} }

You want the arithmetic mean (slang: "average") of the two functions. To do this you just calculate the pointwise arithmetic mean:

fc3 = { {1, (0.3+0.1)/2} ... }
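In C#, the pointwise mean could be sketched as follows. It only works when every list has the same x values in the same order, which the question says is not the case for the raw data, so the lists would first have to be resampled onto common x values. PointwiseMean is a hypothetical name, and the exceptions it throws are a choice made for this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: pointwise arithmetic mean of lists that all share the same x values.
static List<Tuple<double, double>> PointwiseMean(List<List<Tuple<double, double>>> functions)
{
    int count = functions[0].Count;
    if (functions.Any(f => f.Count != count))
        throw new ArgumentException("All lists must have the same number of points.");

    var mean = new List<Tuple<double, double>>();
    for (int i = 0; i < count; i++)
    {
        double x = functions[0][i].Item1;
        if (functions.Any(f => f[i].Item1 != x))
            throw new ArgumentException("All lists must share the same x values.");
        // Average the y values found at this x across all lists.
        mean.Add(Tuple.Create(x, functions.Average(f => f[i].Item2)));
    }
    return mean;
}
```

With the two example lists above, the first resulting point is {1, (0.3+0.1)/2}, i.e. approximately {1, 0.2}.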

Optimization: If you have large numbers of points, you should first convert your "ordered List of X/Y Pairs" into a matrix, or at least store the points column-wise like so: {0.3, 0.1}, {0.5, 0.8}, {0.1, 0.4}

5 answers

Answer 1
4 2011-02-18 15:51:10

Answer 2
4 Accepted 2011-02-18 16:25:42

Answer 3
2 2011-02-18 15:12:00

Answer 4
0 2011-02-18 14:57:07

Answer 5
0 2011-02-18 15:03:18
