C# more efficient way of comparing two collections

I have two collections:

List<Car> currentCars = GetCurrentCars();
List<Car> newCars = GetNewCars();

I don't want to use a foreach loop or something similar, because I think there should be a much better way of doing this.

I am looking for a more efficient way to compare these collections and get two results:

  1. List of cars that are in newCars but not in currentCars
  2. List of cars that are in currentCars but not in newCars

The type Car has an int property Id.

There was an answer, since deleted, that put into words what I mean by efficient: less code, less mechanics, and more readable cases.

So, thinking about it this way, what are the options I have?

What would be less code, less mechanics, and more readable cases?

You can do it like this:

// 1) List of cars in newCars and not in currentCars
var newButNotCurrentCars = newCars.Except(currentCars);

// 2) List of cars in currentCars and not in newCars
var currentButNotNewCars = currentCars.Except(newCars);

The code uses the Enumerable.Except extension method (available in .NET 3.5 and later).

I believe this fulfills your criteria of "less code, less mechanics, and more readable".

You can use Except:

var currentCarsNotInNewCars = currentCars.Except(newCars);
var newCarsNotInCurrentCars = newCars.Except(currentCars);

But this has no performance benefit over the foreach solution. It just looks cleaner.
Also, be aware that you need to implement IEquatable<T> on your Car class (and override GetHashCode to match), so that the comparison is done on the Id and not on the reference.
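
A minimal sketch of what that could look like, assuming Car carries nothing but the int Id property described in the question. The CarDiffDemo class, the IdsOnlyIn helper, and the sample ids are made up for illustration. GetHashCode is overridden together with Equals because Except hashes the elements it compares.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape of Car: only the Id property comes from the question.
public sealed class Car : IEquatable<Car>
{
    public int Id { get; set; }

    // Two cars count as equal when their ids match.
    public bool Equals(Car other) => other != null && Id == other.Id;

    public override bool Equals(object obj) => Equals(obj as Car);

    // Must agree with Equals, since Except hashes the elements.
    public override int GetHashCode() => Id;
}

public static class CarDiffDemo
{
    // Hypothetical helper: ids of the cars in 'left' that are missing from 'right'.
    public static List<int> IdsOnlyIn(IEnumerable<Car> left, IEnumerable<Car> right)
        => left.Except(right).Select(c => c.Id).ToList();

    public static void Main()
    {
        var currentCars = new List<Car> { new Car { Id = 1 }, new Car { Id = 2 } };
        var newCars = new List<Car> { new Car { Id = 2 }, new Car { Id = 3 } };

        Console.WriteLine("new only: " + string.Join(",", IdsOnlyIn(newCars, currentCars)));
        Console.WriteLine("current only: " + string.Join(",", IdsOnlyIn(currentCars, newCars)));
    }
}
```

Without the overrides, the two Car instances with Id 2 would be distinct references and both would show up in the results.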

Performance-wise, a better approach would be to use not a List<T> but a Dictionary<TKey, TValue> with the Id as the key:

var currentCarsDictionary = currentCars.ToDictionary(x => x.Id);
var newCarsDictionary = newCars.ToDictionary(x => x.Id);

var currentCarsNotInNewCars = 
    currentCarsDictionary.Where(x => !newCarsDictionary.ContainsKey(x.Key))
                         .Select(x => x.Value);

var newCarsNotInCurrentCars = 
    newCarsDictionary.Where(x => !currentCarsDictionary.ContainsKey(x.Key))
                     .Select(x => x.Value);

If you start with them in HashSets, you can use the Except method.

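// Assumes GetCurrentCars() and GetNewCars() return HashSet<Car> here.
HashSet<Car> currentCars = GetCurrentCars();
HashSet<Car> newCars = GetNewCars();

var currentButNotNew = currentCars.Except(newCars);
var newButNotCurrent = newCars.Except(currentCars);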

It would be much faster with a set than with a list. (Under the hood, a list just does a foreach; sets can be optimized.)

I'd override Equals (and GetHashCode) on Car to compare by Id, and then you could use the Enumerable.Except extension method. If you can't override Equals, you can create your own IEqualityComparer<Car> that compares two cars by Id.

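class CarComparer : IEqualityComparer<Car>
{
    public bool Equals(Car x, Car y)
    {
        // Same reference (including two nulls) counts as equal.
        if (ReferenceEquals(x, y)) return true;
        return x != null && y != null && x.Id == y.Id;
    }

    public int GetHashCode(Car obj)
    {
        // The Id itself is a suitable hash; null hashes to 0.
        return obj == null ? 0 : obj.Id;
    }
}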
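
As a rough illustration of how such a comparer might be passed to Except, here is a sketch along the lines of the class above. The Car shape, the ComparerDemo class, and its Diff helper are assumptions made for the example, not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape of Car: only the Id property comes from the question.
public class Car
{
    public int Id { get; set; }
}

// Equality by Id, without touching the Car class itself.
public class CarComparer : IEqualityComparer<Car>
{
    public bool Equals(Car x, Car y)
    {
        if (ReferenceEquals(x, y)) return true;
        return x != null && y != null && x.Id == y.Id;
    }

    public int GetHashCode(Car obj) => obj == null ? 0 : obj.Id;
}

public static class ComparerDemo
{
    // Hypothetical helper: ids of the cars in 'left' that have no match in 'right'.
    public static List<int> Diff(IEnumerable<Car> left, IEnumerable<Car> right)
        => left.Except(right, new CarComparer()).Select(c => c.Id).ToList();

    public static void Main()
    {
        var currentCars = new List<Car> { new Car { Id = 10 }, new Car { Id = 20 } };
        var newCars = new List<Car> { new Car { Id = 20 }, new Car { Id = 30 } };

        Console.WriteLine("new only: " + string.Join(",", Diff(newCars, currentCars)));
        Console.WriteLine("current only: " + string.Join(",", Diff(currentCars, newCars)));
    }
}
```

The second argument of Except is what replaces the default reference comparison with the Id-based one.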

You can use LINQ...

List<Car> currentCars = new List<Car>();
List<Car> newCars = new List<Car>();

List<Car> currentButNotNew = currentCars.Where(c => !newCars.Contains(c)).ToList();
List<Car> newButNotCurrent = newCars.Where(c => !currentCars.Contains(c)).ToList();

...but do not be fooled. It may be less code for you, but there will definitely be some for loops in there somewhere.

EDIT: Didn't realise there was an Except method :(

If you're looking for efficiency, implement IComparable on Car (sorting on your unique Id) and use a SortedList. You can then walk through both collections together and evaluate your checks in O(n). This of course adds cost to list inserts in order to maintain the sorted order.
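
The walk over two sorted collections could look roughly like the sketch below. It assumes both lists are already sorted ascending by Id and contain no duplicate ids; the SortedDiff class and its Walk method are hypothetical names, and the sketch compares ids directly rather than going through IComparable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape of Car: only the Id property comes from the question.
public class Car
{
    public int Id { get; set; }
}

public static class SortedDiff
{
    // Assumes both inputs are sorted ascending by Id with unique ids.
    // Each element is visited once, so the walk is linear in the total length.
    public static (List<Car> OnlyInLeft, List<Car> OnlyInRight) Walk(
        IReadOnlyList<Car> left, IReadOnlyList<Car> right)
    {
        var onlyLeft = new List<Car>();
        var onlyRight = new List<Car>();
        int i = 0, j = 0;

        while (i < left.Count && j < right.Count)
        {
            if (left[i].Id == right[j].Id) { i++; j++; }      // present in both
            else if (left[i].Id < right[j].Id) onlyLeft.Add(left[i++]);
            else onlyRight.Add(right[j++]);
        }

        // Whatever remains on either side has no counterpart.
        while (i < left.Count) onlyLeft.Add(left[i++]);
        while (j < right.Count) onlyRight.Add(right[j++]);

        return (onlyLeft, onlyRight);
    }

    public static void Main()
    {
        var currentCars = new List<Car> { new Car { Id = 1 }, new Car { Id = 2 }, new Car { Id = 5 } };
        var newCars = new List<Car> { new Car { Id = 2 }, new Car { Id = 3 } };

        var (removed, added) = Walk(currentCars, newCars);
        Console.WriteLine("current only: " + string.Join(",", removed.Select(c => c.Id)));
        Console.WriteLine("new only: " + string.Join(",", added.Select(c => c.Id)));
    }
}
```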

You can copy the smaller list into a hash-table-based collection like HashSet or Dictionary, and then iterate over the second list and check whether each item exists in the hash table.

This will reduce the time from O(N^2), in the naive foreach-inside-foreach case, to O(N).

This is the best you can do without knowing more about the lists. You may be able to do a little better if the lists are sorted, for example, but since you have to "touch" each car at least once to check whether it's on the new car list, you can never do better than O(N).
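
A minimal sketch of that idea, assuming the Id property is what identifies a car. The HashDiff class and the NotIn helper are hypothetical names; the helper builds a HashSet<int> of ids from one list and filters the other list against it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape of Car: only the Id property comes from the question.
public class Car
{
    public int Id { get; set; }
}

public static class HashDiff
{
    // Cars from 'source' whose Id does not occur in 'other'.
    // Building the set is linear in 'other'; each Contains is O(1) on average.
    public static List<Car> NotIn(IEnumerable<Car> source, IEnumerable<Car> other)
    {
        var otherIds = new HashSet<int>(other.Select(c => c.Id));
        return source.Where(c => !otherIds.Contains(c.Id)).ToList();
    }

    public static void Main()
    {
        var currentCars = new List<Car> { new Car { Id = 1 }, new Car { Id = 2 } };
        var newCars = new List<Car> { new Car { Id = 2 }, new Car { Id = 3 } };

        Console.WriteLine("new only: " + string.Join(",", NotIn(newCars, currentCars).Select(c => c.Id)));
        Console.WriteLine("current only: " + string.Join(",", NotIn(currentCars, newCars).Select(c => c.Id)));
    }
}
```

Getting both result lists this way takes one set per direction, which is still linear in the combined size of the two lists.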

If comparing the Id property is enough to say whether one Car is equal to another, then, in order to avoid some sort of loop, you could replace the List with your own class that keeps track of the items and uses IEquatable on the entire collection, like this:

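class CarComparer : IList<Car>, IEquatable<CarComparer>
{
    // Running hash over the Ids of the items; update it on every mutation.
    private int _runningHash;

    public bool Equals(CarComparer other)
    {
        // Two lists are treated as equal when their running hashes match.
        return other != null && _runningHash == other._runningHash;
    }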

    public override int GetHashCode()
    {
        return _runningHash;
    }

    public void Insert(int index, Car item)
    {
        // Update _runningHash here
        throw new NotImplementedException();
    }

    public void RemoveAt(int index)
    {
        // Update _runningHash here
        throw new NotImplementedException();
    }

    // More IList<Car> Overrides ....
}

Then you just need to implement Add, Remove, and any other methods that might affect the items in the list. You can keep a private variable that is a hash of some sort over the Ids of the items in the list. In your Equals methods you can then just compare this private variable. It is not the cleanest approach by far (as you have to keep your hash variable up to date), but it means you don't have to loop through the lists to do a comparison. If it were me, I would just use LINQ, as some have mentioned here...
