C# Finding an element in a List

Let's say I have the following C# code:

var my_list = new List<string>();
// Filling the list with tons of sentences.
string sentence = Console.ReadLine();

Is there any difference between the following two calls?

bool c1 = my_list.Contains(sentence);
bool c2 = my_list.Any(s => s == sentence);

I imagine the underlying algorithms aren't exactly the same, but what are the practical differences? Is one faster or more efficient than the other? Can one ever return true while the other returns false? What should I consider when picking one over the other? Or is it purely up to me, with both working in any situation?

Realistically, the two operate in almost the same fashion: each iterates over the list's items and checks whether sentence matches any element, giving a complexity of about O(n). I would argue for List.Contains, since it is a little easier and more natural to read, but it's entirely a matter of preference!

Now, if you're looking for something faster in terms of lookup complexity and speed, I'd suggest a HashSet<T>. Generally speaking, a HashSet has a lookup of about O(1), since the hashing function should, in theory, be a constant-time operation. Again, just a suggestion :)
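As a rough illustration of that suggestion, here is a minimal sketch that copies the list into a HashSet<string> and queries the set instead. The sample sentences are made up for the example; only my_list and sentence come from the question.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sample data standing in for the "tons of sentences".
var my_list = new List<string> { "first sentence", "second sentence", "third sentence" };

// Building the set is a one-time O(n) pass over the list.
var lookup = new HashSet<string>(my_list);

string sentence = "second sentence";

// Average-case O(1) membership test, versus the O(n) scan of List.Contains.
bool found = lookup.Contains(sentence);
Console.WriteLine(found); // True
```

This only pays off when the same collection is searched many times, since building the set costs a full pass. A HashSet also discards duplicates and does not preserve order, so it may need to live alongside the list rather than replace it.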

The most upvoted answer isn't completely correct (and it shows why big O doesn't always tell the whole story). Any will be slower than Contains in this scenario, taking about twice as long.

Any makes an extra call on every iteration: it invokes the delegate you specified for each item in the list, which Contains does not have to do. That extra call slows it down substantially.

The results will be the same, but the speed will be very different.
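To make the extra call visible, here is a simplified sketch of the two loops. AnySketch and ContainsSketch are hypothetical stand-ins written for this illustration, not the actual library source; they only mirror the general shape of each method.

```csharp
using System;
using System.Collections.Generic;

var items = new List<string> { "a", "b", "c" };
Console.WriteLine(Sketch.AnySketch(items, s => s == "b"));  // True
Console.WriteLine(Sketch.ContainsSketch(items, "b"));       // True

// Simplified stand-ins (assumption: illustrative only, not the real implementation).
static class Sketch
{
    // Roughly what Any(predicate) does: one delegate invocation per element,
    // on top of walking the sequence through its enumerator.
    public static bool AnySketch<T>(IEnumerable<T> source, Func<T, bool> predicate)
    {
        foreach (T item in source)
            if (predicate(item)) return true;
        return false;
    }

    // Roughly what List<T>.Contains does: a direct indexed loop that asks
    // the default equality comparer, with no user delegate involved.
    public static bool ContainsSketch<T>(List<T> list, T value)
    {
        var comparer = EqualityComparer<T>.Default;
        for (int i = 0; i < list.Count; i++)
            if (comparer.Equals(list[i], value)) return true;
        return false;
    }
}
```

Both loops stop at the first match and are O(n) in the worst case; the difference is the per-element cost, which is what the benchmark measures.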

Example benchmark:

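using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

Stopwatch watch = new Stopwatch();

List<string> stringList = new List<string>();

// Fill the list with 10 million strings: "0", "1", "2", ...
for (int i = 0; i < 10000000; i++)
{
    stringList.Add(i.ToString());
}
int t = 0;

// Time 1 million Any lookups; "29" sits at index 29, so each search stops early.
watch.Start();
for (int i = 0; i < 1000000; i++)
    if (stringList.Any(x => x == "29"))
        t = i;

watch.Stop();
Console.WriteLine("Any takes: " + watch.ElapsedMilliseconds);
GC.Collect();
watch.Restart();

// Time the same 1 million lookups with Contains.
for (int i = 0; i < 1000000; i++)
    if (stringList.Contains("29"))
        t = i;

watch.Stop();

Console.WriteLine("Contains takes: " + watch.ElapsedMilliseconds);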

Results (in milliseconds):

Any takes: 481
Contains takes: 235

The list size and the number of iterations will not affect the percentage difference; Any will always be slower.

For string objects, there's no difference, since the == operator simply calls String.Equals.

However, for other objects, there could be differences between == and .Equals. Looking at the implementation of .Contains, it uses EqualityComparer<T>.Default, which hooks into Equals(T) as long as your class implements IEquatable<T> (where T is the class itself) and otherwise falls back to Object.Equals. Without overloading ==, most classes instead use referential comparison for ==, since that's what they inherit from Object.
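A small sketch of a case where the two calls disagree. The Point class is hypothetical and exists only for this illustration: it implements IEquatable<Point> but deliberately does not overload ==.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var list = new List<Point> { new Point(1, 2) };
var probe = new Point(1, 2); // equal by value, but a different instance

// Contains goes through EqualityComparer<Point>.Default, which calls Point.Equals(Point).
Console.WriteLine(list.Contains(probe));       // True

// == is not overloaded on Point, so this compares references.
Console.WriteLine(list.Any(p => p == probe));  // False

// Hypothetical class used only for this illustration.
class Point : IEquatable<Point>
{
    public int X { get; }
    public int Y { get; }
    public Point(int x, int y) { X = x; Y = y; }

    // Value equality: two points are equal when their coordinates match.
    public bool Equals(Point other) => other is not null && X == other.X && Y == other.Y;
    public override bool Equals(object obj) => Equals(obj as Point);
    public override int GetHashCode() => HashCode.Combine(X, Y);
}
```

Writing the predicate as p => p.Equals(probe), or overloading == on the class, would bring the two calls back into agreement.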
