
Removing duplicates from List<double[]>

I am trying to remove the duplicates from a list of double arrays. I would like to keep the first instance of the duplicate but remove any found after.

Here is my code:

private static List<double[]> RemoveDupes(List<double[]> locData)
    {
        List<double[]> list = locData;
        while (ContainsDupes(list))
            for (int a = 0; a < list.Count; a++)
                for (int b = 0; b < list.Count; b++)
                    if (a != b && list[a][0] == list[b][0] && list[a][1] == list[b][1])
                        list.RemoveAt(b);

        return list;
    }
private static bool ContainsDupes(List<double[]> list)
    {
        for (int a = 0; a < list.Count; a++)
            for (int b = 0; b < list.Count; b++)
                if (a != b && list[a][0] == list[b][0] && list[a][1] == list[b][1])
                    return true;
        return false;
    }

This method works almost all of the time, but it's slow, and in edge cases (about 1 in a few thousand) it crashes my program with an index exception on line 6. I can't think of any other way to do this, so any help would be appreciated.

Input:

{{45.5, 23.6}, {34.54, 98.34}, {45.5, 23.6}}

Desired output:

{{45.5, 23.6}, {34.54, 98.34}}

(length of the double[] is always 2)

Since you've stated that the array will always have a size of 2, I suggest using a different data type. A tuple, for example, would be more appropriate, because these are actually pairs of values.

For example, you could define a collection of pairs:

List<(double, double)> pairs = new List<(double, double)>(); // C# 7.0+ (value tuples)

List<Tuple<double, double>> pairsCollection = new List<Tuple<double, double>>(); // earlier C# versions (System.Tuple)

Seed it in this manner:

pairs.Add((45.5, 23.6));
pairs.Add((34.54, 98.34));
pairs.Add((45.5, 23.6));

And then just use the Distinct method to remove duplicates:

pairs = pairs.Distinct().ToList(); // Distinct returns a new sequence; it does not modify the list in place

This would output:

{{45.5, 23.6}, {34.54, 98.34}}

In addition, if you are not able to change the data type, you can project the collection into a collection of pairs and then apply Distinct to it:

List<double[]> collection = new List<double[]>()
{
    new double[]{45.5, 23.6},
    new double[]{34.54, 98.34},
    new double[]{45.5, 23.6}
};
var pairs = collection.Select(pa => (pa[0], pa[1])); 
var distinctPairs = pairs.Distinct();
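If the result has to be a List<double[]> again, one option is to track the pairs already seen and keep only the first array for each pair. The following is a minimal sketch, not part of the original answer: the class name PairDedupe is made up for illustration, RemoveDupes simply mirrors the method name from the question, and every array is assumed to have at least two elements. Pairs are compared for exact equality, so values that differ only by rounding error count as different.

```csharp
using System.Collections.Generic;

public static class PairDedupe // hypothetical helper name, for illustration only
{
    // Returns a new list that keeps the first occurrence of each (x, y) pair.
    // Assumes every array has at least two elements, as stated in the question.
    public static List<double[]> RemoveDupes(List<double[]> locData)
    {
        var seen = new HashSet<(double, double)>();
        var result = new List<double[]>();

        foreach (double[] item in locData)
        {
            // HashSet.Add returns false when the pair is already present.
            if (seen.Add((item[0], item[1])))
            {
                result.Add(item);
            }
        }

        return result;
    }
}
```

This makes a single pass over the input and leaves the original list untouched, so no index can go out of range while items are being dropped.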

You could use SequenceEqual: https://docs.microsoft.com/en-us/dotnet/api/system.linq.enumerable.sequenceequal?redirectedfrom=MSDN&view=netframework-4.8#System_Linq_Enumerable_SequenceEqual__1_System_Collections_Generic_IEnumerable___0__System_Collections_Generic_IEnumerable___0__

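// Requires: using System.Collections.Generic; using System.Linq;
var l = new List<int[]>()
{
    new int[] { 5, 4, 3 },
    new int[] { 5, 4, 3 },
    new int[] { 5, 4, 2 },
};

// Indexes of later duplicates; a sorted set avoids storing the same index twice.
var indexStore = new SortedSet<int>();

for (int i = 0; i < l.Count - 1; i++)
{
    // Compare against every later element, including the last one.
    for (int x = i + 1; x < l.Count; x++)
    {
        if (l[i].SequenceEqual(l[x]))
        {
            indexStore.Add(x);
        }
    }
}

// Remove from the highest index down so earlier removals do not shift the later ones.
foreach (var index in indexStore.Reverse())
{
    l.RemoveAt(index);
}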

Do not remove items while looping; it is better to store the duplicate indexes and remove them afterwards.
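The same SequenceEqual comparison can also be handed to Distinct through a custom equality comparer, which avoids index bookkeeping altogether. This is only a sketch under stated assumptions: ArrayComparer is a hypothetical name that does not appear in the original answer, and the hash combination (17 and 31) is just one common choice.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical comparer, for illustration only: two arrays are equal
// when they hold the same values in the same order.
public sealed class ArrayComparer<T> : IEqualityComparer<T[]>
{
    public bool Equals(T[] a, T[] b)
    {
        if (ReferenceEquals(a, b)) return true;
        if (a == null || b == null) return false;
        return a.SequenceEqual(b);
    }

    public int GetHashCode(T[] array)
    {
        // Combine the element hashes so that equal arrays get equal hash codes.
        unchecked
        {
            int hash = 17;
            foreach (T value in array)
            {
                hash = hash * 31 + EqualityComparer<T>.Default.GetHashCode(value);
            }
            return hash;
        }
    }
}
```

With this in place, l.Distinct(new ArrayComparer<int>()).ToList() produces a new list without the repeated arrays. In practice LINQ to Objects yields the first occurrence of each element in source order, although the documentation describes the result of Distinct as unordered.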
