简体   繁体   中英

What is the Pythonic way to find the longest common prefix of a list of lists?

Given : a list of lists, such as [[3,2,1], [3,2,1,4,5], [3,2,1,8,9], [3,2,1,5,7,8,9]]

Todo : Find the longest common prefix of all sublists.

Exists : In another thread " Common elements between two lists not using sets in Python ", it is suggested to use "Counter", which is available above python 2.7. However our current project was written in python 2.6, so "Counter" is not used.

I currently code it like this:

l = [[3,2,1], [3,2,1,4,5], [3,2,1,8,9], [3,2,1,5,7,8,9]]
newl = l[0]
if len(l)>1:
    for li in l[1:]:
    newl = [x for x in newl if x in li]

But I find it not very pythonic, is there a better way of coding?

Thanks!

New edit : Sorry to mention: in my case, the shared elements of the lists in 'l' have the same order and alway start from the 0th item. So you wont have cases like [[1,2,5,6],[2,1,7]]

os.path.commonprefix() works well for lists :)

>>> x = [[3,2,1], [3,2,1,4,5], [3,2,1,8,9], [3,2,1,5,7,8,9]]
>>> import os
>>> os.path.commonprefix(x)
[3, 2, 1]

I am not sure how pythonic it is

from itertools import takewhile,izip

x = [[3,2,1], [3,2,1,4,5], [3,2,1,8,9], [3,2,1,5,7,8,9]]

def allsame(x):
    return len(set(x)) == 1

r = [i[0] for i in takewhile(allsame ,izip(*x))]

Given your example code, you seem to want a version of reduce(set.intersection, map(set, l)) that preserves the initial order of the first list.

This requires algorithmic improvements, not stylistic improvements; "pythonic" code alone won't do you any good here. Think about the situation that must hold for all values that occur in every list:

Given a list of lists, a value occurs in every list if and only if it occurs in nlist lists, where nlist is the total number of lists.

If we can guarantee that each value occurs only once in every list, then the above can be rephrased:

Given a list of lists of unique items, a value occurs in every list if and only if it occurs nlist times total.

We can use sets to guarantee that the items in our lists are unique, so we can combine this latter principle with a simple counting strategy:

>>> l = [[3,2,1], [3,2,1,4,5], [3,2,1,8,9], [3,2,1,5,7,8,9]]
>>> count = {}
>>> for i in itertools.chain.from_iterable(map(set, l)):
...     count[i] = count.get(i, 0) + 1
...     

Now all we have to do is filter the original list:

>>> [i for i in l[0] if count[i] == len(l)]
[3, 2, 1]

Here's an alternative way using itertools:

>>> import itertools
>>> L = [[3,2,1,4], [3,2,1,4,5], [3,2,1,8,9], [3,2,1,5,7,8,9]]
>>> common_prefix = []
>>> for i in itertools.izip(*L):
...    if i.count(i[0]) == len(i):
...       common_prefix.append(i[0])
...    else:
...       break
... 
>>> common_prefix
[3, 2, 1]

Not sure how "pythonic" it might be considered though.

这是低效的,因为一旦发现不匹配就不会提前,但它的整洁:

([i for i,(j,k) in enumerate(zip(a,b)) if j!=k] or [0])[0]

An modernized Vertical scan solution using a generator expression and Python 3's builtin zip :

lst = [[3,2,1], [3,2,1,1,5], [3,2,1,8,9], [3,2,1,5,7,8,9]]

next(zip(*(x for x in zip(*lst) if len(set(x)) == 1)))
# (3, 2, 1)

See also a related Leetcode problem - Longest Common Prefix .

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM