Which is faster and why? Set or List?

Let's say that I have a graph and want to check whether b in N[a]. Which implementation is faster, and why?

a, b = range(2)
N = [set([b]), set([a,b])]

or

N = [[b], [a, b]]

This is obviously oversimplified, but imagine that the graph becomes really dense.

Membership testing in a set is vastly faster, especially for large sets. That is because the set uses a hash function to map each element to a bucket. Since Python implementations automatically resize the hash table, a lookup takes constant time (O(1)) on average no matter how large the set is, assuming the hash function is sufficiently good.

In contrast, to evaluate whether an object is a member of a list, Python has to compare it with the elements one by one until it finds a match. In the worst case it compares every single element, i.e. the test is O(n).
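To make the difference concrete, here is a minimal sketch that counts how many equality comparisons a membership test triggers. The Counted class and the count_comparisons helper are hypothetical names introduced only for this illustration, and the exact count for the set is a CPython implementation detail rather than a guarantee.

```python
class Counted:
    """Hypothetical wrapper that counts how often __eq__ is called."""
    eq_calls = 0

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        Counted.eq_calls += 1
        return isinstance(other, Counted) and self.value == other.value

    def __hash__(self):
        # Equal objects must hash equally, so hash the wrapped value.
        return hash(self.value)


def count_comparisons(container, needle):
    """Return (found, number of __eq__ calls) for `needle in container`."""
    Counted.eq_calls = 0
    found = needle in container
    return found, Counted.eq_calls


items = [Counted(i) for i in range(1000)]
needle = Counted(999)  # equal to the last item, but a distinct object

list_found, list_eq = count_comparisons(items, needle)
set_found, set_eq = count_comparisons(set(items), needle)

print("list:", list_found, list_eq)
print("set: ", set_found, set_eq)
```

The list has to call __eq__ once per element until it reaches the match, while the set only compares against whatever it finds in the slot selected by the hash.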

It all depends on what you're trying to accomplish. Using your example verbatim, it's faster to use lists, as you don't have to go through the overhead of creating the sets:

import timeit

def use_sets(a, b):
    # Build the adjacency structure out of sets
    return [set([b]), set([a, b])]

def use_lists(a, b):
    # Build the same structure out of plain lists
    return [[b], [a, b]]

# Time only the construction of each structure, not any lookups
t = timeit.Timer("use_sets(a, b)", """from __main__ import use_sets
a, b = range(2)""")
print("use_sets()", t.timeit(number=1000000))

t = timeit.Timer("use_lists(a, b)", """from __main__ import use_lists
a, b = range(2)""")
print("use_lists()", t.timeit(number=1000000))

Produces:

use_sets() 1.57522511482
use_lists() 0.783344984055

However, for the reasons already mentioned here, you benefit from using sets when you are searching large collections. It's impossible to tell from your example where the crossover point is for you, or whether you'll see the benefit at all.

I suggest you test it both ways and go with whatever is faster for your specific use case.
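One way to run such a test is sketched below. It times a worst-case membership lookup (the last element) in a list and in a set for a few sizes. The time_membership helper, the chosen sizes, and the repeat count are assumptions made for illustration; absolute timings will vary by machine and Python version.

```python
import timeit


def time_membership(size, repeats=1000):
    """Return (list_seconds, set_seconds) for a worst-case lookup."""
    as_list = list(range(size))
    as_set = set(as_list)
    target = size - 1  # last element: the list must scan everything
    list_time = timeit.timeit(lambda: target in as_list, number=repeats)
    set_time = timeit.timeit(lambda: target in as_set, number=repeats)
    return list_time, set_time


results = {size: time_membership(size) for size in (1, 10, 100, 1000, 10000)}

for size, (list_time, set_time) in results.items():
    print(f"n={size:>6}  list={list_time:.6f}s  set={set_time:.6f}s")
```

Combining these lookup timings with the construction cost measured above should give a rough idea of where sets start to pay off for a given neighbourhood size.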

A set (I mean a hash-based set such as HashSet) is much faster than a list for looking up a value. A list has to be scanned sequentially to find out whether the value exists. A HashSet can jump directly to the right bucket and look up the value in almost constant time.
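The bucket idea can be sketched with a toy hash set. ToyHashSet is a hypothetical class written only to illustrate the principle; it uses separate chaining with a simple doubling rule, which is not how Python's built-in set is actually implemented.

```python
class ToyHashSet:
    """Illustrative separate-chaining hash set (not Python's real set)."""

    def __init__(self, bucket_count=8):
        self._buckets = [[] for _ in range(bucket_count)]
        self._size = 0

    def _bucket_for(self, value):
        # The hash picks the bucket directly; no scan over all elements.
        return self._buckets[hash(value) % len(self._buckets)]

    def add(self, value):
        bucket = self._bucket_for(value)
        if value not in bucket:
            bucket.append(value)
            self._size += 1
            # Grow the table so that buckets stay short on average.
            if self._size > 2 * len(self._buckets):
                self._resize()

    def _resize(self):
        old_values = [v for bucket in self._buckets for v in bucket]
        self._buckets = [[] for _ in range(2 * len(self._buckets))]
        for v in old_values:
            self._bucket_for(v).append(v)

    def __contains__(self, value):
        # Only the single selected bucket is searched.
        return value in self._bucket_for(value)

    def __len__(self):
        return self._size


neighbours = ToyHashSet()
for node in range(100):
    neighbours.add(node)

print(42 in neighbours, 100 in neighbours, len(neighbours))
```

Because the table grows along with the number of elements, each lookup only ever inspects a short bucket, which is where the near-constant lookup time comes from.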
