Getting the list of objects that occur exactly twice in a list

I have a List<CustomPoint> points that contains close to a million objects. From this list I would like to get the objects that occur exactly twice. What would be the fastest way to do this? I would also be interested in a non-LINQ option, since I might have to do this in C++ as well.

public class CustomPoint
{
    public double X { get; set; }
    public double Y { get; set; }

    public CustomPoint(double x, double y)
    {
        this.X = x;
        this.Y = y;
    }
}

public class PointComparer : IEqualityComparer<CustomPoint>
{
    public bool Equals(CustomPoint x, CustomPoint y)
    {
        return ((x.X == y.X) && (y.Y == x.Y));
    }

    public int GetHashCode(CustomPoint obj)
    {
        int hash = 0;
        hash ^= obj.X.GetHashCode();
        hash ^= obj.Y.GetHashCode();
        return hash;
    }
}

Based on this answer, I tried:

list.GroupBy(x => x).Where(x => x.Count() == 2).Select(x => x.Key).ToList();

but this gives zero objects in the new list. Can someone guide me on this?

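You should implement Equals and GetHashCode in the class itself rather than in PointComparer.

For the code to work, you need to pass an instance of PointComparer as the second argument to GroupBy.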
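
The suggestion to pass the comparer to GroupBy might look like the sketch below. It assumes the CustomPoint and PointComparer classes from the question (repeated here so the snippet stands alone, with null handling added to the comparer); the wrapper name ExactlyTwice.Find is hypothetical. Without a comparer, GroupBy uses the default equality for CustomPoint, which is reference equality for a class that does not override Equals, so every group has a count of 1 and the filter for a count of 2 matches nothing.

```csharp
using System.Collections.Generic;
using System.Linq;

public class CustomPoint
{
    public double X { get; set; }
    public double Y { get; set; }

    public CustomPoint(double x, double y)
    {
        this.X = x;
        this.Y = y;
    }
}

public class PointComparer : IEqualityComparer<CustomPoint>
{
    public bool Equals(CustomPoint a, CustomPoint b)
    {
        if (ReferenceEquals(a, b)) return true;
        if (a is null || b is null) return false;
        return a.X == b.X && a.Y == b.Y;
    }

    public int GetHashCode(CustomPoint p)
    {
        return p.X.GetHashCode() ^ p.Y.GetHashCode();
    }
}

// Hypothetical helper, not part of the original post.
public static class ExactlyTwice
{
    public static List<CustomPoint> Find(IEnumerable<CustomPoint> points)
    {
        return points
            .GroupBy(p => p, new PointComparer()) // comparer passed as the second argument
            .Where(g => g.Count() == 2)
            .Select(g => g.Key)
            .ToList();
    }
}
```

Calling ExactlyTwice.Find(points) returns one representative instance for each point value that appears exactly twice.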

This method works for me:

public class PointCount
{
    public CustomPoint Point { get; set; }
    public int Count { get; set; }
}

private static IEnumerable<CustomPoint> GetPointsByCount(Dictionary<int, PointCount> pointcount, int count)
{
    return pointcount
                    .Where(p => p.Value.Count == count)
                    .Select(p => p.Value.Point);
}

private static Dictionary<int, PointCount> GetPointCount(List<CustomPoint> pointList)
{
    var allPoints = new Dictionary<int, PointCount>();

    foreach (var point in pointList)
    {
        int hash = point.GetHashCode();

        if (allPoints.ContainsKey(hash))
        {
            allPoints[hash].Count++;
        }
        else
        {
            allPoints.Add(hash, new PointCount { Point = point, Count = 1 });
        }
    }

    return allPoints;
}

Called like this:

static void Main(string[] args)
{
    List<CustomPoint> list1 = CreateCustomPointList();

    var doubles = GetPointsByCount(GetPointCount(list1), 2);

    Console.WriteLine("Doubles:");
    foreach (var point in doubles)
    {
        Console.WriteLine("X: {0}, Y: {1}", point.X, point.Y);
    }
}

private static List<CustomPoint> CreateCustomPointList()
{
    var result = new List<CustomPoint>();

    for (int i = 0; i < 5; i++)
    {
        for (int j = 0; j < 5; j++)
        {
            result.Add(new CustomPoint(i, j));
        }
    }

    result.Add(new CustomPoint(1, 3));
    result.Add(new CustomPoint(3, 3));
    result.Add(new CustomPoint(0, 2));

    return result;
}

CustomPoint implementation:

public class CustomPoint
{
    public double X { get; set; }
    public double Y { get; set; }

    public CustomPoint(double x, double y)
    {
        this.X = x;
        this.Y = y;
    }

    public override bool Equals(object obj)
    {
        var other = obj as CustomPoint;

        // Null or an object of a different type is never equal to this point.
        if (other == null)
        {
            return false;
        }

        return ((this.X == other.X) && (this.Y == other.Y));
    }

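    public override int GetHashCode()
    {
        // unchecked: let the multiplications wrap around instead of throwing
        // if the project is compiled with overflow checking enabled.
        unchecked
        {
            int hash = 23;
            hash = hash * 31 + this.X.GetHashCode();
            hash = hash * 31 + this.Y.GetHashCode();
            return hash;
        }
    }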
}

It prints:

Doubles:
X: 0, Y: 2
X: 1, Y: 3
X: 3, Y: 3

As you can see in GetPointCount(), I create a dictionary with one entry per unique CustomPoint (keyed by its hash code). Each entry holds a PointCount object containing a reference to the CustomPoint and a Count that starts at 1; every time the same point is encountered again, the Count is incremented.

Finally, in GetPointsByCount I return the CustomPoints in the dictionary where PointCount.Count == count, in your case 2.
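
One possible variation, sketched here as an assumption and not as part of the answer above: key the dictionary by the point itself instead of by its hash code. With an int key, two distinct points whose hash codes happen to collide would be counted as one; with CustomPoint as the key, Dictionary uses GetHashCode to find the bucket and Equals to resolve collisions. The sketch also uses only plain loops (no LINQ), which should translate fairly directly to a C++ std::unordered_map with a custom hash and equality. The names PointCounter and GetPointsOccurring are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public class CustomPoint : IEquatable<CustomPoint>
{
    public double X { get; set; }
    public double Y { get; set; }

    public CustomPoint(double x, double y)
    {
        this.X = x;
        this.Y = y;
    }

    public bool Equals(CustomPoint other)
    {
        return other != null && this.X == other.X && this.Y == other.Y;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as CustomPoint);
    }

    public override int GetHashCode()
    {
        unchecked
        {
            int hash = 23;
            hash = hash * 31 + this.X.GetHashCode();
            hash = hash * 31 + this.Y.GetHashCode();
            return hash;
        }
    }
}

// Hypothetical helper, not part of the original post.
public static class PointCounter
{
    public static List<CustomPoint> GetPointsOccurring(IEnumerable<CustomPoint> points, int count)
    {
        // First pass: count occurrences, keyed by point value.
        var counts = new Dictionary<CustomPoint, int>();
        foreach (var point in points)
        {
            int seen;
            counts.TryGetValue(point, out seen); // seen stays 0 for a new point
            counts[point] = seen + 1;
        }

        // Second pass: keep the points with the requested count.
        var result = new List<CustomPoint>();
        foreach (var pair in counts)
        {
            if (pair.Value == count)
            {
                result.Add(pair.Key);
            }
        }
        return result;
    }
}
```

PointCounter.GetPointsOccurring(list1, 2) would then play the role of GetPointsByCount(GetPointCount(list1), 2). Both passes are linear in the number of points on average. Note that the points must not be mutated while they are used as dictionary keys.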

Please also note that I updated the GetHashCode() method, since yours returns the same value for the points (1,2) and (2,1). If you do want that, feel free to restore your own hashing method. You will have to test the hash function, though, because it is hard to hash two numbers uniquely into one. How well it works depends on the range of numbers used, so you should implement a hash function that fits your own needs.
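
The difference between the two hashing schemes can be checked with a small sketch like the following. The class and method names (HashDemo, XorHash, OrderedHash) are hypothetical; the bodies mirror the XOR combination from the question and the multiply-and-add combination from the updated CustomPoint.

```csharp
using System;

// Hypothetical helper, not part of the original post.
public static class HashDemo
{
    // The question's scheme: XOR of the two coordinate hashes.
    // XOR is commutative, so swapping X and Y cannot change the result,
    // and any point with X == Y hashes to 0.
    public static int XorHash(double x, double y)
    {
        int hash = 0;
        hash ^= x.GetHashCode();
        hash ^= y.GetHashCode();
        return hash;
    }

    // The answer's scheme: multiply-and-add, which depends on the order
    // in which the coordinates are mixed in.
    public static int OrderedHash(double x, double y)
    {
        unchecked
        {
            int hash = 23;
            hash = hash * 31 + x.GetHashCode();
            hash = hash * 31 + y.GetHashCode();
            return hash;
        }
    }
}
```

HashDemo.XorHash(1, 2) == HashDemo.XorHash(2, 1) always holds, whereas HashDemo.OrderedHash(1, 2) and HashDemo.OrderedHash(2, 1) are expected to differ. Neither scheme can rule out collisions in general, since two 64-bit doubles are folded into a single 32-bit int.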
