Counting occurrences of words per item in a list
Suppose I have a list: ['cat dog', 'cat cat', 'dog', 'cat cat cat']
I want the count for 'cat' to be 3 (counted at most once per item in the list, not 6).
I'm currently using:
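from collections import Counter
sequence_of_sentences = ['cat dog', 'cat cat', 'dog', 'cat cat cat']
counts = Counter()
for sentence in sequence_of_sentences:
    # Tallies every word, so repeats inside one sentence are all counted ('cat' -> 6)
    counts.update(word for word in sentence.split())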
Updated: Should be 3 instances of cat :)
I do not understand how you get 4. Your example list
>>> l = ['cat dog', 'cat cat', 'dog', 'cat cat cat']
has 3 unique 'cat's: the first, second and last elements. In case you want that, use
>>> sum(1 for i in l if 'cat' in i)
or, as @holden excellently suggests (it would never have occurred to me),
>>> sum(('cat' in i) for i in l)
which reads excellently.
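One caveat with both expressions: `'cat' in i` is a substring test on the string, so an item such as 'concatenate' would also be counted. Below is a minimal sketch of a whole-word variant, assuming words are separated by whitespace. The `count_items_containing` name and the extra sample item are illustrative and not from the question.

```python
def count_items_containing(items, word):
    """Count the items in which `word` appears as a whole whitespace-separated token."""
    return sum(word in item.split() for item in items)

# Hypothetical sample data: the question's list plus a substring trap.
samples = ['cat dog', 'cat cat', 'dog', 'cat cat cat', 'concatenate']

# Substring test: also matches the 'cat' inside 'concatenate'.
substring_count = sum('cat' in item for item in samples)

# Whole-word test: only matches items that contain the token 'cat'.
whole_word_count = count_items_containing(samples, 'cat')

print(substring_count, whole_word_count)
```

For the list in the question the two approaches agree; they only differ once an item contains the search term inside a longer word.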
Check out collections.Counter and set. Counter is very handy for creating tallies (i.e. counting), and set is great for removing duplicates from a sequence.
from collections import Counter

phrases = ['cat dog', 'cat cat', 'dog', 'cat cat cat']
all_counts = Counter()
occurrence_counts = Counter()
for phrase in phrases:
    words = phrase.split()
    distinct_words = set(words)
    all_counts.update(words)
    occurrence_counts.update(distinct_words)

all_counts['cat']  # 6
occurrence_counts['cat']  # 3
update() updates the tallies based on what you pass it.
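To make that concrete, here is a small sketch of how `Counter.update()` treats different arguments. The `tally` variable and its inputs are made up for illustration. The point is that counts are added to what is already there rather than replaced.

```python
from collections import Counter

tally = Counter()

# A list: every element adds 1, so repeats are counted.
tally.update(['cat', 'cat', 'dog'])

# A set: each distinct element adds 1.
tally.update({'cat'})

# A mapping: its values are added to the existing counts.
tally.update({'dog': 5})

print(tally['cat'], tally['dog'])

# A key that was never seen reads as 0 instead of raising KeyError.
print(tally['bird'])
```

This is why passing `set(words)` instead of `words` is enough to count each word at most once per phrase.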
Play around with set a bit by running python from the command line and you should get an idea of what is going on above:
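$ python
>>> animals = ['bird', 'bird', 'cat']
>>> set(animals)
{'bird', 'cat'}
>>> list(set(animals))
['bird', 'cat']
(Python 2 prints the first result as set(['bird', 'cat']). In either version the order of the elements is not guaranteed.)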
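Once the role of set is clear, the per-item count can also be written as a single expression. This is a compact sketch of the same idea rather than a different method; it reuses the `phrases` list and the `occurrence_counts` name from the example above.

```python
from collections import Counter

phrases = ['cat dog', 'cat cat', 'dog', 'cat cat cat']

# set(...) collapses repeats inside one phrase, so each phrase
# contributes at most 1 to any word's tally.
occurrence_counts = Counter(
    word for phrase in phrases for word in set(phrase.split())
)

print(occurrence_counts['cat'], occurrence_counts['dog'])
```

The explicit loop is easier to extend (for example, to keep the total count alongside), while the one-liner is handy when only the per-item tally is needed.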