What is the most efficient way to search nested lists in Python?

Question

I have a list that contains nested lists, and I need to know the most efficient way to search within those nested lists.

For example, if I have

[['a','b','c'],
['d','e','f']]

and I have to search the entire list above, what is the most efficient way to find 'd'?

Answer 1

>>> lis=[['a','b','c'],['d','e','f']]
>>> any('d' in x for x in lis)
True
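
The one-liner above can be wrapped in small helpers. This is a minimal sketch, not part of the original answer: the names `nested_contains` and `find_position` are hypothetical, and both assume exactly one level of nesting.

```python
def nested_contains(nested, target):
    """Return True if target is an element of any sublist (one level deep)."""
    # any() stops at the first sublist for which `target in sublist` is True.
    return any(target in sublist for sublist in nested)


def find_position(nested, target):
    """Return (outer_index, inner_index) of the first match, or None."""
    for i, sublist in enumerate(nested):
        if target in sublist:
            return i, sublist.index(target)
    return None


lis = [['a', 'b', 'c'], ['d', 'e', 'f']]
print(nested_contains(lis, 'd'))   # True
print(find_position(lis, 'd'))     # (1, 0)
print(find_position(lis, 'z'))     # None
```

`find_position` is only useful when you need to know where the item is; for a plain membership test the `any` form is enough.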

Generator expression using any:

$ python -m timeit -s "lis=[['a','b','c'],['d','e','f'],[1,2,3],[4,5,6],[7,8,9],[10,11,12],[13,14,15],[16,17,18]]" "any('d' in x for x in lis)" 
1000000 loops, best of 3: 1.32 usec per loop

Generator expression:

$ python -m timeit -s "lis=[['a','b','c'],['d','e','f'],[1,2,3],[4,5,6],[7,8,9],[10,11,12],[13,14,15],[16,17,18]]" "'d' in (y for x in lis for y in x)"
100000 loops, best of 3: 1.56 usec per loop

List comprehension:

$ python -m timeit -s "lis=[['a','b','c'],['d','e','f'],[1,2,3],[4,5,6],[7,8,9],[10,11,12],[13,14,15],[16,17,18]]" "'d' in [y for x in lis for y in x]"
100000 loops, best of 3: 3.23 usec per loop
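
The three command-line measurements above can also be reproduced from inside Python with the standard `timeit` module. This is a rough sketch; the loop count of 100,000 and the `statements` and `results` names are choices made here for illustration, and the absolute numbers will differ between machines and Python versions.

```python
import timeit

setup = (
    "lis=[['a','b','c'],['d','e','f'],[1,2,3],[4,5,6],"
    "[7,8,9],[10,11,12],[13,14,15],[16,17,18]]"
)

statements = {
    "any + generator": "any('d' in x for x in lis)",
    "flattening generator": "'d' in (y for x in lis for y in x)",
    "list comprehension": "'d' in [y for x in lis for y in x]",
}

number = 100_000
results = {}
for label, stmt in statements.items():
    # repeat() returns one total time per run; keep the best of three runs.
    best = min(timeit.repeat(stmt, setup=setup, number=number, repeat=3))
    results[label] = best / number * 1e6  # microseconds per loop
    print(f"{label:22s} {results[label]:.3f} usec per loop")
```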

How about if the item is near the end, or not present at all? any is faster than the list comprehension:

$ python -m timeit -s "lis=[['a','b','c'],['d','e','f'],[1,2,3],[4,5,6],[7,8,9],[10,11,12],[13,14,15],[16,17,18]]" "'NOT THERE' in [y for x in lis for y in x]"
100000 loops, best of 3: 4.4 usec per loop

$ python -m timeit -s "lis=[['a','b','c'],['d','e','f'],[1,2,3],[4,5,6],[7,8,9],[10,11,12],[13,14,15],[16,17,18]]" "any('NOT THERE' in x for x in lis)"
100000 loops, best of 3: 3.06 usec per loop

Perhaps if the list is 1000 times longer? any is still faster:

$ python -m timeit -s "lis=1000*[['a','b','c'],['d','e','f'],[1,2,3],[4,5,6],[7,8,9],[10,11,12],[13,14,15],[16,17,18]]" "'NOT THERE' in [y for x in lis for y in x]"
100 loops, best of 3: 3.74 msec per loop
$ python -m timeit -s "lis=1000*[['a','b','c'],['d','e','f'],[1,2,3],[4,5,6],[7,8,9],[10,11,12],[13,14,15],[16,17,18]]" "any('NOT THERE' in x for x in lis)"
100 loops, best of 3: 2.48 msec per loop

We know that generators take a while to set up, so the best chance for the list comprehension to win is a very short list:

$ python -m timeit -s "lis=[['a','b','c']]" "any('c' in x for x in lis)"
1000000 loops, best of 3: 1.12 usec per loop
$ python -m timeit -s "lis=[['a','b','c']]" "'c' in [y for x in lis for y in x]"
1000000 loops, best of 3: 0.611 usec per loop

And any uses less memory too, because it never builds the flattened list.
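
The memory point can be illustrated roughly as follows. This is a sketch assuming CPython, where `sys.getsizeof` reports only the size of the container object itself (not the elements it refers to); the variable names and the list length are chosen here for illustration.

```python
import sys

lis = 1000 * [['a', 'b', 'c'], ['d', 'e', 'f'], [1, 2, 3], [4, 5, 6]]

flat_list = [y for x in lis for y in x]   # materializes all 12000 references
flat_gen = (y for x in lis for y in x)    # produces elements only on demand

list_size = sys.getsizeof(flat_list)      # grows with the number of elements
gen_size = sys.getsizeof(flat_gen)        # small and independent of the input size

print(len(flat_list), list_size, gen_size)
```

The generator passed to `any` behaves like `flat_gen` here: its footprint stays constant however long `lis` is.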

Answer 2

Using a list comprehension, given:

mylist = [['a','b','c'],['d','e','f']]
'd' in [j for i in mylist for j in i]

yields:

True

This could also be done with a generator (as shown by @AshwiniChaudhary).

Update based on comment below:

Here is the same list comprehension, but using more descriptive variable names:

'd' in [elem for sublist in mylist for elem in sublist]

The looping constructs in the list comprehension are equivalent to

for sublist in mylist:
    for elem in sublist:
        ...  # each elem becomes one item of the resulting list

and generate a flat list that 'd' can be tested against with the in operator.
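
Written out in full, the equivalent explicit loop might look like the sketch below. The name `flattened` is introduced here for illustration and does not appear in the original answer.

```python
mylist = [['a', 'b', 'c'], ['d', 'e', 'f']]

# Same iteration order as: [elem for sublist in mylist for elem in sublist]
flattened = []
for sublist in mylist:
    for elem in sublist:
        flattened.append(elem)

print(flattened)          # ['a', 'b', 'c', 'd', 'e', 'f']
print('d' in flattened)   # True
```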

Answer 3

Use a generator expression; here the whole list will not be traversed, because a generator produces its results one by one:

>>> lis = [['a','b','c'],['d','e','f']]
>>> 'd' in (y for x in lis for y in x)
True
>>> gen = (y for x in lis for y in x)
>>> 'd' in gen
True
>>> list(gen)
['e', 'f']

~$ python -m timeit -s "lis=[['a','b','c'],['d','e','f'],[1,2,3],[4,5,6],[7,8,9],[10,11,12],[13,14,15],[16,17,18]]" "'d' in (y for x in lis for y in x)"
    100000 loops, best of 3: 2.96 usec per loop

~$ python -m timeit -s "lis=[['a','b','c'],['d','e','f'],[1,2,3],[4,5,6],[7,8,9],[10,11,12],[13,14,15],[16,17,18]]" "'d' in [y for x in lis for y in x]"
    100000 loops, best of 3: 7.4 usec per loop
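
As an alternative not mentioned in the original answer, the standard library's `itertools.chain.from_iterable` gives the same lazy, one-level flattening without writing the nested generator by hand. A minimal sketch (the name `lazy` is chosen here for illustration):

```python
from itertools import chain

lis = [['a', 'b', 'c'], ['d', 'e', 'f']]

lazy = chain.from_iterable(lis)   # yields 'a', 'b', 'c', 'd', ... on demand
found = 'd' in lazy               # iteration stops as soon as 'd' is seen
rest = list(lazy)                 # only the items after 'd' remain

print(found)   # True
print(rest)    # ['e', 'f']
```

As with the generator expression, the iterator is partly consumed by the membership test, so create a fresh one for each search.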

Answer 4

If your arrays are always sorted as you show, so that a[i][j] <= a[i][j+1] and a[i][-1] <= a[i+1][0] (the last element of one array is always less than or equal to the first element in the next array), then you can eliminate a lot of comparisons by doing something like:

def find_in_sorted(a, theValue):  # a is your big array
    previous = None
    for subarray in a:
        # Since the subarrays are sorted, a first element greater than
        # theValue means it's not in the current subarray, and must be in
        # the previous one (if it is present at all)
        if subarray[0] > theValue:
            break
        # Otherwise, we keep track of the last array we looked at
        else:
            previous = subarray

    return (theValue in previous) if previous else False

This kind of optimization is only worthwhile if you have a lot of arrays and they all have a lot of elements, though.
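
Under the same sortedness assumption, the linear scan over subarrays can be swapped for binary search using the standard `bisect` module. This is a sketch of a different technique from the one in the answer; the function name `sorted_nested_contains` is hypothetical, and it additionally assumes that no subarray is empty.

```python
import bisect


def sorted_nested_contains(a, value):
    """Membership test for a list of sorted, non-empty, non-overlapping sublists."""
    # First elements of each sublist; precompute this once if searching repeatedly.
    firsts = [sub[0] for sub in a]
    # Index of the last sublist whose first element is <= value.
    i = bisect.bisect_right(firsts, value) - 1
    if i < 0:
        return False  # value is smaller than everything in a
    sub = a[i]
    # Binary search inside the candidate sublist.
    j = bisect.bisect_left(sub, value)
    return j < len(sub) and sub[j] == value


a = [['a', 'b', 'c'], ['d', 'e', 'f']]
print(sorted_nested_contains(a, 'd'))   # True
print(sorted_nested_contains(a, 'z'))   # False
```

Building `firsts` is itself linear in the number of subarrays, so the gain only shows up when the same data is searched many times and `firsts` is kept around.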

Answer 5

If you just want to know whether your element is somewhere in the list, you can convert the list to a string and check that. This extends to more deeply nested lists, such as [[1],'a','b','d',['a','b',['c',1]]]. The method is helpful if you don't know the depth of nesting and only want to know whether the item you are searching for is present.

    search='d'
    lis = [['a',['b'],'c'],[['d'],'e','f']]
    print(search in str(lis))
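
One caveat with the string approach is that it tests for a substring, so it can report a match when the search text merely appears inside a longer element. A recursive search avoids this while still handling unknown nesting depth. The sketch below is an alternative, not the original author's method; the name `deep_contains` is hypothetical, and it only descends into `list` objects.

```python
def deep_contains(nested, target):
    """Return True if target equals any element at any nesting depth."""
    for item in nested:
        if item == target:
            return True
        # Recurse only into nested lists; other containers are compared as-is.
        if isinstance(item, list) and deep_contains(item, target):
            return True
    return False


lis = [['a', ['b'], 'c'], [['d'], 'e', 'f']]
print(deep_contains(lis, 'd'))        # True

words = [['dog'], ['cat']]
print('d' in str(words))              # True, although 'd' is not an element
print(deep_contains(words, 'd'))      # False
```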


Answer 1: score 11, 2012-08-15 04:43:18
Answer 2: score 5, accepted, 2012-08-15 03:24:08
Answer 3: score 4, 2012-08-15 03:23:31
Answer 4: score 2, 2012-08-15 03:37:53
Answer 5: score 0, 2018-07-27 20:30:21
