
List Lookup Performance - Does returning the last element of a list have to scan through the whole list?

Let's say I have a dictionary:

myDict = {
    'title': 'a nice title',
    'nice_list': [1,2,3,4,5,6,6,7,...,99999],
    'nice_lists_last_item': 99999,
}

I only want to append an item to nice_list if it is larger than the final item.

Which is quicker:

  1. Using: if new_element > nice_list[-1]

or

  2. Using: if new_element > nice_lists_last_item

Does method 1 have to scan through the whole list (and/or put all of nice_list into memory each time) to find that item? Which is quicker (bearing in mind I intend to do a few billion of these comparisons)?

Method 2 would store the last element as its own distinct dict entry, so is that faster?

When in doubt, test:

>>> %timeit if 1 > myDict['nice_list'][-1]: 0
10000000 loops, best of 3: 110 ns per loop
>>> %timeit if 1 > myDict['nice_lists_last_item']: 0
10000000 loops, best of 3: 68.8 ns per loop
>>> nice_list = myDict['nice_list']
>>> %timeit if 1 > nice_list[-1]: 0
10000000 loops, best of 3: 62.6 ns per loop
>>> nice_lists_last_item = myDict['nice_lists_last_item']
>>> %timeit if 1 > nice_lists_last_item: 0                      
10000000 loops, best of 3: 43.4 ns per loop

As you can see, accessing the dictionary value directly is faster than accessing the list from the dictionary and then accessing its last value. But accessing the last value of the list directly, through a name already bound to the list, is faster still. This should be no surprise; Python lists know their own length and are implemented in memory as arrays, so finding the last item is as simple as subtracting 1 from the length and doing pointer arithmetic. Accessing dictionary keys is a bit slower because of the overhead of hashing and collision detection; but it's only slower by a few nanoseconds. And finally, if you really want to save a few more nanoseconds, you could store the last value in its own variable.

The biggest slowdown comes when you do both: a dictionary lookup followed by a list index.
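For readers without IPython, the four measurements above can be approximated with the standard-library timeit module. This is only a sketch: the list contents stand in for the elided list in the question, the `measure` helper and the labels are made up for illustration, and the absolute numbers will differ by machine and Python version.

```python
import timeit

# Stand-in for the dictionary in the question (the real list contents are assumed).
myDict = {
    'title': 'a nice title',
    'nice_list': list(range(1, 100_000)),
    'nice_lists_last_item': 99_999,
}
nice_list = myDict['nice_list']
nice_lists_last_item = myDict['nice_lists_last_item']

# The four comparisons timed in the session above.
statements = {
    'dict -> list -> [-1]': "1 > myDict['nice_list'][-1]",
    'dict -> stored item': "1 > myDict['nice_lists_last_item']",
    'bound list -> [-1]': "1 > nice_list[-1]",
    'bound stored item': "1 > nice_lists_last_item",
}


def measure(number=1_000_000, repeat=3):
    """Return the best per-loop time in nanoseconds for each statement."""
    results = {}
    for label, stmt in statements.items():
        timings = timeit.repeat(stmt, globals=globals(), number=number, repeat=repeat)
        results[label] = min(timings) / number * 1e9
    return results


if __name__ == '__main__':
    for label, ns in measure().items():
        print(f'{label:<24} {ns:8.1f} ns per loop')
```

Taking the minimum over several repeats mirrors the "best of 3" figure that %timeit reports.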

Getting an item from a list is O(1), as noted here. Even so, storing the value explicitly will still be faster, because no matter how fast the lookup is, it's still going to be slower than not doing a lookup at all. (However, if you store the value explicitly, you'll have to update it when you add a new item to the list; whether the combined cost of updating it and checking it is more than the cost of grabbing the last item every time is something you'll have to benchmark yourself; it will likely depend on how often you wind up actually appending a new item.)
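The bookkeeping trade-off described above might look like the following sketch. The function names `append_if_larger_indexed` and `append_if_larger_cached` are hypothetical and only illustrate the two methods from the question.

```python
def append_if_larger_indexed(d, new_element):
    """Method 1: compare against the list's last element on every call."""
    nice_list = d['nice_list']
    if new_element > nice_list[-1]:
        nice_list.append(new_element)
        return True
    return False


def append_if_larger_cached(d, new_element):
    """Method 2: compare against a stored copy, which must be kept in sync."""
    if new_element > d['nice_lists_last_item']:
        d['nice_list'].append(new_element)
        d['nice_lists_last_item'] = new_element  # extra write on every append
        return True
    return False


myDict = {
    'title': 'a nice title',
    'nice_list': [1, 2, 3],
    'nice_lists_last_item': 3,
}
append_if_larger_cached(myDict, 2)   # rejected: 2 is not larger than 3
append_if_larger_cached(myDict, 10)  # appended, stored item updated to 10
```

With method 2, every successful append pays for one additional dictionary write, so the more often elements are actually appended, the smaller the saving from skipping the list index is likely to be.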

Note that there is no question of "putting all of nice_list into memory". If you have a dict with a list in it, that entire list is already in memory. Looking up a value in it won't cause it to take up any more memory, but if you have billions of these lists, you will run out of memory before you even try to look anything up, because just creating the lists will use up too much memory.

In CPython, the answer is probably no. list is implemented using dynamic arrays, so reading an element by index does not need to walk through the other elements.
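One way to check this empirically is to time nice_list[-1] on lists of very different sizes. This is a rough sketch, with `time_last_item` being a hypothetical helper; if indexing is constant-time, the per-access figures should come out roughly equal regardless of size, though the exact values depend on the machine.

```python
import timeit


def time_last_item(size, number=200_000):
    """Return the average time in nanoseconds to read lst[-1] for a list of `size` items."""
    lst = list(range(size))
    total = timeit.timeit('lst[-1]', globals={'lst': lst}, number=number)
    return total / number * 1e9


if __name__ == '__main__':
    for size in (10, 1_000, 1_000_000):
        print(f'{size:>9} items: {time_last_item(size):6.1f} ns per access')
```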
