Count word occurrences per item in a list

Question

Suppose I have a list: ['cat dog', 'cat cat', 'dog', 'cat cat cat']

I want the count for 'cat' to be 3 (counted at most once per item in the list), not 6 (every occurrence).

I'm currently using:

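from collections import Counter

sequence_of_sentences = ['cat dog', 'cat cat', 'dog', 'cat cat cat']
counts = Counter()  # tallies every word occurrence, so counts['cat'] is 6
for sentence in sequence_of_sentences:
    counts.update(word for word in sentence.split())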

Updated: Should be 3 instances of cat :)

Answer 1

I don't understand how you get 4. Your example list

>>> l = ['cat dog', 'cat cat', 'dog', 'cat cat cat']

has 3 items that contain 'cat': the first, second, and last elements. If that is what you want, use

>>> sum(1 for i in l if 'cat' in i)

or, as @holden excellently suggests (it would never have occurred to me),

>>> sum(('cat' in i) for i in l)

which reads excellently.
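
One caveat worth noting: `'cat' in i` is a substring test on each string, so an item such as 'concatenate' would also be counted. A minimal sketch, assuming words are separated by whitespace and using a hypothetical helper name `count_items_with_word` (not part of the original answer), that matches whole words only:

```python
def count_items_with_word(items, word):
    """Count how many items contain `word` as a whole whitespace-separated token."""
    # str.split() with no arguments splits on any run of whitespace.
    return sum(word in item.split() for item in items)


l = ['cat dog', 'cat cat', 'dog', 'cat cat cat']
print(count_items_with_word(l, 'cat'))       # 3

# Substring matching also counts 'concatenate'; whole-word matching does not.
tricky = ['concatenate', 'cat']
print(sum('cat' in i for i in tricky))       # 2
print(count_items_with_word(tricky, 'cat'))  # 1
```

For the list in the question both approaches give 3, so the difference only matters when one word can appear inside another.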

Answer 2

Check out collections.Counter and set. Counter is very handy for creating tallies (i.e., counting), and set is great for removing duplicates from a sequence.

from collections import Counter

phrases = ['cat dog', 'cat cat', 'dog', 'cat cat cat']    
all_counts = Counter()
occurrence_counts = Counter()

for phrase in phrases:
    words = phrase.split()
    distinct_words = set(words)
    all_counts.update(words)
    occurrence_counts.update(distinct_words)

all_counts['cat']        # 6
occurrence_counts['cat'] # 3

update() adds to the existing tallies based on what you pass it.
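
To make that concrete, here is a small sketch (the variable name `tally` is only for illustration) showing that `Counter.update()` adds to existing counts rather than replacing them, and that passing a set contributes at most 1 per distinct word:

```python
from collections import Counter

tally = Counter()
tally.update(['cat', 'cat', 'dog'])  # a list: every element is counted
print(tally['cat'], tally['dog'])    # 2 1

tally.update({'cat', 'dog'})         # a set: each distinct word adds 1
print(tally['cat'], tally['dog'])    # 3 2

print(tally['bird'])                 # 0, missing keys count as zero
```

This is why the loop above feeds `words` to one counter and `distinct_words` to the other: the same method produces total counts or per-item counts depending on whether duplicates were removed first.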

Play around with set a bit by running python from the command line, and you should get an idea of what is going on above (the output shown is from Python 2; Python 3 displays the set as {'bird', 'cat'}, and the element order may differ):

$ python
>>> animals = [ 'bird', 'bird', 'cat' ]
>>> set(animals)
set(['bird', 'cat'])
>>> list(set(animals))
['bird', 'cat']
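
Putting the two ideas together, the per-item count from the loop above can also be written as a single expression. This is a sketch of the same technique, not a different algorithm; `phrases` is the example list from the question:

```python
from collections import Counter

phrases = ['cat dog', 'cat cat', 'dog', 'cat cat cat']

# set(...) removes repeats within one phrase, so each phrase
# contributes at most 1 to the count of any given word.
occurrence_counts = Counter(
    word for phrase in phrases for word in set(phrase.split())
)

print(occurrence_counts['cat'])  # 3
print(occurrence_counts['dog'])  # 2
```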

Answer 1
2 2014-03-03 23:32:49

Answer 2
0 Accepted 2014-03-04 00:01:25
