
Python: Count each letter in a list of words

So I have a list of words, `wordList = list()`. Right now, I am counting each letter in each of the words throughout the whole list using this code:

from collections import Counter

cnt = Counter()
for words in wordList:
    for letters in words:
        cnt[letters] += 1  # counts every occurrence, including repeats within a word

However, I want it to count differently. I want the function to find the most common letter out of all the words in the list, but only by counting each letter per word once (ignoring the fact that some words can have multiple copies of the same letter).

For example, if the list contained 'happy', 'harpy' and 'hasty', the two p's in 'happy' should only be counted once. So the function should return a list of the highest-frequency letters (in order) without double counting. In the above case it would be 'h, a, p, y, r, s'.

cnt = Counter()
for words in wordList:
      for letters in set(words):
          cnt[letters]+=1

Add a set call:

cnt = Counter()
for word in wordList:
    for letter in set(word):
        cnt[letter] += 1
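As a self-contained check of the set approach, here is a minimal sketch that wraps the loop in a function and runs it on the three words from the question. The function name `count_letters_once_per_word` is made up for illustration and does not appear in the original answers.

```python
from collections import Counter

def count_letters_once_per_word(words):
    """Count each letter at most once per word (hypothetical helper name)."""
    cnt = Counter()
    for word in words:
        # set(word) drops repeated letters, so 'happy' contributes a single 'p'
        for letter in set(word):
            cnt[letter] += 1
    return cnt

counts = count_letters_once_per_word(['happy', 'harpy', 'hasty'])
print(counts['p'])  # 2: one from 'happy', one from 'harpy'
print(counts['h'])  # 3: 'h' appears in all three words
```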

An alternative approach using the iterator combinators in itertools:

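import collections
import itertools

# map(set, ...) turns each word into its set of distinct letters;
# chain.from_iterable flattens those sets into one stream for Counter.
# (itertools.imap exists only in Python 2; the built-in map works in both.)
cnt = collections.Counter(itertools.chain.from_iterable(map(set, wordList)))

Or bind the set to a name before looping over it:

from collections import Counter

cnt = Counter()
for word in wordList:
    lSet = set(word)  # distinct letters of this word
    for letter in lSet:
        cnt[letter] += 1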

You can eliminate one for loop with update, which adds counts from an iterable (in this case, the set of letters in each word):

from collections import Counter

words = 'happy harpy hasty'.split()
c = Counter()
for word in words:
    c.update(set(word))  # each distinct letter counts once per word
print(c.most_common())
print([a[0] for a in c.most_common()])

[('a', 3), ('h', 3), ('y', 3), ('p', 2), ('s', 1), ('r', 1), ('t', 1)]
['a', 'h', 'y', 'p', 's', 'r', 't']
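Letters with equal counts (here 'a', 'h' and 'y') come out in whatever order the sets happened to yield them, so the tie order can differ between runs. If a stable order matters, one option is to sort explicitly. This is a sketch that assumes alphabetical order is an acceptable tie-breaker, which the question does not specify.

```python
from collections import Counter

words = 'happy harpy hasty'.split()
c = Counter()
for word in words:
    c.update(set(word))

# Sort by descending count, then alphabetically, so ties are deterministic.
ordered = sorted(c, key=lambda letter: (-c[letter], letter))
print(ordered)  # ['a', 'h', 'y', 'p', 'r', 's', 't']
```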

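This builds a set from each word, chains the sets together, and passes the result to the Counter constructor. Letters with equal counts may appear in a different order from run to run: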

>>> from itertools import chain
>>> from operator import itemgetter
>>> from collections import Counter
>>> words = 'happy', 'harpy', 'hasty'
>>> counter = Counter(chain.from_iterable(map(set, words)))
>>> list(map(itemgetter(0), counter.most_common()))
['a', 'h', 'y', 'p', 's', 'r', 't']
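
Another answer takes the keys of a Counter built from the whole string:

import collections

# Note: this counts every character in the string, including the spaces,
# and keys() is not sorted by frequency.
cnt = collections.Counter('happy harpy hasty').keys()
cnt = list(cnt)
print(cnt)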
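To see how counting over the whole string differs from what the question asks for, the following sketch compares the two on the same words. The variable names `per_occurrence` and `per_word` are illustrative only.

```python
from collections import Counter

words = ['happy', 'harpy', 'hasty']

# Counting over the joined string counts every occurrence, plus the spaces.
per_occurrence = Counter(' '.join(words))

# Counting over per-word sets counts each letter once per word.
per_word = Counter(letter for word in words for letter in set(word))

print(per_occurrence['p'])    # 3: two from 'happy', one from 'harpy'
print(per_word['p'])          # 2: one per word containing 'p'
print(' ' in per_occurrence)  # True: the separator is counted as well
```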
