
Python: Count each letter in a list of words

So I have a list of words, `wordList = list()`. Right now, I am counting each letter in each of the words throughout the whole list using this code:

from collections import Counter

cnt = Counter()
for words in wordList:
    for letters in words:
        cnt[letters] += 1

However, I want it to count differently. I want the function to find the most common letter out of all the words in the list, but counting each letter only once per word (ignoring the fact that some words contain multiple copies of the same letter).

For example, if the list contained 'happy', 'harpy' and 'hasty', the two p's in happy should only be counted once. So the function should return a list of the letters ordered by frequency (highest first) without double counting. In the above case it would be 'h, a, y, p, r, s, t'.

cnt = Counter()
for words in wordList:
    for letters in set(words):
        cnt[letters] += 1

Add a set call:

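from collections import Counter

cnt = Counter()
for word in wordList:
    # set(word) keeps one copy of each letter, so a word adds at most 1 per letter
    for letter in set(word):
        cnt[letter] += 1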
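To get from these counts to the ordered list of letters the question asks for, the loop can be wrapped in a small helper. This is only a sketch: the function name `letters_by_word_frequency` is made up for illustration, and the alphabetical tie-break is an assumption added so the result is reproducible (the question does not say how ties should be ordered).

```python
from collections import Counter


def letters_by_word_frequency(words):
    """Return letters ordered by how many words contain them (hypothetical helper)."""
    counts = Counter()
    for word in words:
        # set(word) drops repeated letters, so each word adds at most 1 per letter
        counts.update(set(word))
    # Assumption: sort by descending count, then alphabetically to make ties deterministic
    return sorted(counts, key=lambda letter: (-counts[letter], letter))


ranked = letters_by_word_frequency(["happy", "harpy", "hasty"])
print(ranked)  # ['a', 'h', 'y', 'p', 'r', 's', 't']
```

Without the explicit tie-break, letters with the same count can come out in any order, because the iteration order of a set of strings is not fixed between runs.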

An alternative approach using the iterator combinators in itertools:

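import collections
import itertools

# On Python 3 the built-in map is lazy; Python 2 needs itertools.imap instead.
cnt = collections.Counter(itertools.chain.from_iterable(map(set, wordList)))

from collections import Counter

cnt = Counter()
for word in wordList:
    lSet = set(word)  # distinct letters of this word
    for letter in lSet:
        cnt[letter] += 1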
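As a quick sanity check, the explicit nested loop can be compared with a single flattened expression. The sketch below uses a plain generator expression, which is an alternative not taken from the answers here, and the sample `words` list is an assumption borrowed from the question's example.

```python
from collections import Counter

words = ["happy", "harpy", "hasty"]  # sample data from the question

# Explicit nested loop, one increment per distinct letter per word
loop_counts = Counter()
for word in words:
    for letter in set(word):
        loop_counts[letter] += 1

# The same traversal flattened into one generator expression
flat_counts = Counter(letter for word in words for letter in set(word))

print(loop_counts == flat_counts)  # True
```

Both forms visit the same (word, letter) pairs, so which one to use is mostly a matter of readability.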

You can eliminate a for with update, which updates the counts from an iterable (in this case, the set of letters in a word):

from collections import Counter

words = 'happy harpy hasty'.split()
c = Counter()
for word in words:
    c.update(set(word))  # add 1 for each distinct letter in the word
print(c.most_common())
print([a[0] for a in c.most_common()])

Output (letters with equal counts may come out in a different order):

[('a', 3), ('h', 3), ('y', 3), ('p', 2), ('s', 1), ('r', 1), ('t', 1)]
['a', 'h', 'y', 'p', 's', 'r', 't']
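To see what the set call changes, it may help to run both styles of update side by side on the same words. This is a minimal sketch; the counter names `every_occurrence` and `once_per_word` are made up for illustration.

```python
from collections import Counter

words = ["happy", "harpy", "hasty"]

every_occurrence = Counter()  # counts repeated letters within a word
once_per_word = Counter()     # counts each letter at most once per word
for word in words:
    every_occurrence.update(word)     # iterates over every character
    once_per_word.update(set(word))   # iterates over distinct characters only

print(every_occurrence["p"])  # 3: two from "happy", one from "harpy"
print(once_per_word["p"])     # 2: "happy" and "harpy" each contribute one
```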

This creates a set from each word, chains the sets together, and passes the resulting letters to the Counter constructor.

>>> from itertools import chain
>>> from operator import itemgetter
>>> from collections import Counter
>>> words = 'happy', 'harpy', 'hasty'
>>> counter = Counter(chain.from_iterable(map(set, words)))
>>> list(map(itemgetter(0), counter.most_common()))
['a', 'h', 'y', 'p', 's', 'r', 't']

import collections

# keys() yields each distinct character (the space included) in first-seen
# order on Python 3.7+; the result is not ranked by frequency.
cnt = list(collections.Counter('happy harpy hasty').keys())
print(cnt)


 