
Removing duplicates from List<double[]>

I am trying to remove the duplicates from a list of double arrays. I would like to keep the first instance of the duplicate but remove any found after.

Here is my code:

private static List<double[]> RemoveDupes(List<double[]> locData)
    {
        List<double[]> list = locData;
        while (ContainsDupes(list))
            for (int a = 0; a < list.Count; a++)
                for (int b = 0; b < list.Count; b++)
                    if (a != b && list[a][0] == list[b][0] && list[a][1] == list[b][1])
                        list.RemoveAt(b);

        return list;
    }
private static bool ContainsDupes(List<double[]> list)
    {
        for (int a = 0; a < list.Count; a++)
            for (int b = 0; b < list.Count; b++)
                if (a != b && list[a][0] == list[b][0] && list[a][1] == list[b][1])
                    return true;
        return false;
    }

This method works almost all of the time but it's slow and in edge cases (1 out of a few thousand) it crashes my program with an index exception on line 6. I can't think of any other way to do this so any help would be appreciated.

Input:

{{45.5, 23.6}, {34.54, 98.34}, {45.5, 23.6}}

Desired output:

{{45.5, 23.6}, {34.54, 98.34}}

(length of the double[] is always 2)

Since you've stated that the array will always have a length of 2, I suggest using a different data type. A tuple would be more appropriate, because these are really pairs of values.

For example, you could define a collection of pairs:

List<(double, double)> pairs = new List<(double, double)>(); // C# 7.0+ (value tuples)

List<Tuple<double, double>> pairsCollection = new List<Tuple<double, double>>(); // before C# 7.0

Seed it in this manner:

pairs.Add((45.5, 23.6));
pairs.Add((34.54, 98.34));
pairs.Add((45.5, 23.6));

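Then use the Distinct method to remove the duplicates. It returns a new sequence rather than modifying the list, so keep the result:

var distinctPairs = pairs.Distinct().ToList();

distinctPairs then contains: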

{{45.5, 23.6}, {34.54, 98.34}}
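
Put together as a small program, the tuple approach might look like the sketch below. The class name PairDemo and the helper DistinctPairs are made up for illustration. In practice LINQ to Objects yields the elements in order of first occurrence, which matches the "keep the first instance" requirement, but the documentation describes the result of Distinct as unordered, so treat the ordering as an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PairDemo
{
    // Hypothetical helper: returns the distinct pairs as a new list.
    public static List<(double, double)> DistinctPairs(List<(double, double)> pairs)
    {
        // Value tuples compare element by element, so Distinct treats
        // two pairs with the same values as duplicates.
        return pairs.Distinct().ToList();
    }

    public static void Main()
    {
        var pairs = new List<(double, double)>
        {
            (45.5, 23.6),
            (34.54, 98.34),
            (45.5, 23.6),
        };

        // Prints the two remaining pairs, one per line.
        foreach (var (x, y) in DistinctPairs(pairs))
        {
            Console.WriteLine($"{x}, {y}");
        }
    }
}
```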

If you are not able to change the data type, you can project the collection into a collection of pairs and then apply Distinct to it:

List<double[]> collection = new List<double[]>()
{
    new double[]{45.5, 23.6},
    new double[]{34.54, 98.34},
    new double[]{45.5, 23.6}
};
var pairs = collection.Select(pa => (pa[0], pa[1])); 
var distinctPairs = pairs.Distinct();
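
If the rest of the code still expects a List<double[]>, the projection can be wrapped in a method with the same signature as the RemoveDupes from the question. This is only a sketch: the class name LocationData is made up, and it assumes every array has exactly two elements, as the question states. It returns a new list with newly allocated arrays instead of modifying the input.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LocationData
{
    // Returns a new list without duplicate pairs; the input list is left unchanged.
    // Assumes every array has exactly two elements.
    public static List<double[]> RemoveDupes(List<double[]> locData)
    {
        return locData
            .Select(pa => (pa[0], pa[1]))             // project each array to a value tuple
            .Distinct()                               // tuples are compared by value
            .Select(p => new[] { p.Item1, p.Item2 })  // convert each pair back to double[]
            .ToList();
    }

    public static void Main()
    {
        var collection = new List<double[]>
        {
            new double[] { 45.5, 23.6 },
            new double[] { 34.54, 98.34 },
            new double[] { 45.5, 23.6 },
        };

        List<double[]> result = RemoveDupes(collection);
        Console.WriteLine(result.Count); // 2
    }
}
```

This makes a single pass over the data with hash-based lookups, so it avoids both the nested loops and the repeated ContainsDupes scan.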

You could use Enumerable.SequenceEqual to compare the arrays: https://docs.microsoft.com/en-us/dotnet/api/system.linq.enumerable.sequenceequal?redirectedfrom=MSDN&view=netframework-4.8#System_Linq_Enumerable_SequenceEqual__1_System_Collections_Generic_IEnumerable___0__System_Collections_Generic_IEnumerable___0__

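// Requires: using System.Collections.Generic; using System.Linq;
var l = new List<int[]>()
{
    new int[] { 5, 4, 3 },
    new int[] { 5, 4, 3 },
    new int[] { 5, 4, 2 },
};

// Indexes of the later duplicates; a set, so each index is stored only once.
var indexStore = new HashSet<int>();

for (int i = 0; i < l.Count - 1; i++)
{
    // Compare with every later element, including the last one.
    for (int x = i + 1; x < l.Count; x++)
    {
        if (l[i].SequenceEqual(l[x]))
        {
            indexStore.Add(x);
        }
    }
}

// Remove from the highest index down so that the remaining stored indexes stay valid.
foreach (var index in indexStore.OrderByDescending(n => n))
{
    l.RemoveAt(index);
}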

Do not remove items while looping over the list. It is better to store the duplicate indexes first and remove them afterwards.
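
Another way to avoid removing inside a loop is to hand the SequenceEqual comparison to Distinct through a custom comparer. The following is a minimal sketch, not a definitive implementation: the names DoubleArrayComparer and ComparerDemo are made up, and the hash function is just one simple way to make equal arrays produce equal hash codes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical comparer: two arrays are equal when they hold
// the same values in the same order.
public sealed class DoubleArrayComparer : IEqualityComparer<double[]>
{
    public bool Equals(double[] x, double[] y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.SequenceEqual(y);
    }

    public int GetHashCode(double[] obj)
    {
        if (obj == null) return 0;

        // Combine the element hashes; overflow is expected and harmless here.
        unchecked
        {
            int hash = 17;
            foreach (double d in obj)
            {
                hash = hash * 31 + d.GetHashCode();
            }
            return hash;
        }
    }
}

public static class ComparerDemo
{
    public static void Main()
    {
        var list = new List<double[]>
        {
            new double[] { 45.5, 23.6 },
            new double[] { 34.54, 98.34 },
            new double[] { 45.5, 23.6 },
        };

        // Builds a new list; the original list is not modified.
        var unique = list.Distinct(new DoubleArrayComparer()).ToList();
        Console.WriteLine(unique.Count); // 2
    }
}
```

Unlike the tuple projection, this works for arrays of any length and keeps the original array instances.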
