Efficient way of comparing multiple lists in Python
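I have 5 long lists of word pairs, as in the example below. Note that an element may hold a single word pair, such as [['Salad', 'Fat']],
or several word pairs, such as [['Bread', 'Oil'], ['Bread', 'Salt']].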
list_1 = [ [['Salad', 'Fat']], [['Bread', 'Oil'], ['Bread', 'Salt']], [['Salt', 'Sugar']] ]
list_2 = [ [['Salad', 'Fat'], ['Salt', 'Sugar']], [['Protein', 'Soup']] ]
list_3 = [ [['Salad', ' Protein']], [['Bread', ' Oil']], [['Sugar', 'Salt']] ]
list_4 = [ [['Salad', ' Fat'], ['Salad', 'Chicken']] ]
list_5 = [ ['Sugar', 'Protein'], ['Sugar', 'Bread'] ]
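Now I want to count the frequency of the word pairs.
For example, for the 5 lists above I should get output like the following, showing each word pair and its frequency.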
output_list = [{['Salad', 'Fat']: 3}, {['Bread', 'Oil']: 2}, {['Salt', 'Sugar']: 2},
{['Sugar', 'Salt']: 1}, and so on]
What is the most efficient way to do this in Python?
>>> import itertools
>>> from collections import Counter
>>> l = [[1,2,3],[3,4,1,5]]
>>> counts = Counter(list(itertools.chain(*l)))
>>> counts
Counter({1: 2, 3: 2, 2: 1, 4: 1, 5: 1})
Note: this flattening technique only works for a list of lists. For other flattening techniques, see the link provided above.
Edit: thanks to AChampion, counts = Counter(list(itertools.chain(*l)))
can be written as counts = Counter(list(itertools.chain.from_iterable(l)))
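As a small aside on the flattening call: Counter accepts any iterable, so the intermediate list(...) is optional. A minimal sketch, reusing the sample data l from above:

```python
from collections import Counter
from itertools import chain

l = [[1, 2, 3], [3, 4, 1, 5]]

# chain.from_iterable walks the sub-lists lazily; Counter consumes the
# resulting iterator directly, so no intermediate list is built.
counts = Counter(chain.from_iterable(l))
print(counts)
```

The counts are the same as in the session above; only the temporary list is avoided.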
Given that your lists are unevenly nested, this makes the code ugly, so try to fix the input lists.
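If the input cannot be fixed at the source, one possible workaround is to normalize it while iterating. The sketch below is not from the original answers: iter_pairs is a hypothetical helper, and it assumes that every innermost list is a word pair made only of strings.

```python
from collections import Counter

def iter_pairs(nested):
    """Yield each word pair as a tuple of stripped strings, at any nesting depth.

    Hypothetical helper; assumes every innermost list contains only strings.
    """
    for item in nested:
        if item and all(isinstance(word, str) for word in item):
            # A list of strings is a word pair: make it hashable and tidy it.
            yield tuple(word.strip() for word in item)
        else:
            # Anything else is a container of pairs: descend one level.
            yield from iter_pairs(item)

list_1 = [[['Salad', 'Fat']], [['Bread', 'Oil'], ['Bread', 'Salt']], [['Salt', 'Sugar']]]
list_5 = [['Sugar', 'Protein'], ['Sugar', 'Bread']]

# Lists with different nesting depths can be passed together.
counts = Counter(iter_pairs([list_1, list_5]))
print(counts)
```

With a helper like this, list_5 no longer needs separate handling, at the cost of one type check per element.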
collections.Counter() is designed for exactly this kind of thing, but lists are not hashable, so you need to turn them into tuples (and strip off the spurious whitespace):
In []:
import itertools as it
from collections import Counter
list_1 = [ [['Salad', 'Fat']], [['Bread', 'Oil'], ['Bread', 'Salt']], [['Salt', 'Sugar'] ]]
list_2 = [ [['Salad', 'Fat'], ['Salt', 'Sugar']], [['Protein', 'Soup']] ]
list_3 = [ [['Salad', ' Protein']], [['Bread', ' Oil']], [['Sugar', 'Salt'] ]]
list_4 = [ [['Salad', ' Fat'], ['Salad', 'Chicken']] ]
list_5 = [ ['Sugar', 'Protein'], ['Sugar', 'Bread']]
t = lambda x: tuple(map(str.strip, x))
c = Counter(map(t, it.chain.from_iterable(it.chain(list_1, list_2, list_3, list_4))))
c += Counter(map(t, list_5))
c
Out[]:
Counter({('Bread', 'Oil'): 2,
('Bread', 'Salt'): 1,
('Protein', 'Soup'): 1,
('Salad', 'Chicken'): 1,
('Salad', 'Fat'): 3,
('Salad', 'Protein'): 1,
('Salt', 'Sugar'): 2,
('Sugar', 'Bread'): 1,
('Sugar', 'Protein'): 1,
('Sugar', 'Salt'): 1})
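To get closer to the list-of-dicts shape asked for in the question, the Counter can be reshaped afterwards. This is only a sketch: it uses a small subset of the counts above, and the keys remain tuples because lists cannot be dict keys.

```python
from collections import Counter

# A small subset of the counts shown above, for illustration only.
c = Counter({('Salad', 'Fat'): 3, ('Bread', 'Oil'): 2, ('Sugar', 'Salt'): 1})

# most_common() returns (pair, count) tuples sorted by descending count.
output_list = [{pair: n} for pair, n in c.most_common()]
print(output_list)
```

In most cases the Counter itself is easier to work with than a list of single-entry dicts, since it supports direct lookup by pair.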