Count unique words and create dict with word and count in Python

I need help making a function called strcount(S) that returns a dictionary with words as keys and the number of times that a word appears as the corresponding value. The output should be something like this:

>>> strcount("a a a a b b")
{'a': 4, 'b': 2}
>>> strcount("one")
{'one': 1}
>>> sorted(strcount("this one and that one for one time").items())
[('and', 1), ('for', 1), ('one', 3), ('that', 1), ('this', 1), ('time', 1)]

The most Pythonic solution would be to use collections.Counter:

>>> from collections import Counter
>>> sorted(Counter("this one and that one for one time".split()).items())
[('and', 1), ('for', 1), ('one', 3), ('that', 1), ('this', 1), ('time', 1)]
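
To get the plain dictionary the question asks for, the Counter can be wrapped in a function. This is a minimal sketch, assuming that words are separated by whitespace and that punctuation and case are left untouched:

```python
from collections import Counter

def strcount(s):
    """Return a plain dict mapping each whitespace-separated word to its count."""
    # Counter is a dict subclass; dict() turns it into an ordinary dictionary.
    return dict(Counter(s.split()))

print(strcount("a a a a b b"))  # {'a': 4, 'b': 2}
print(strcount("one"))          # {'one': 1}
```

Because Counter already behaves like a dictionary, the dict() call is optional; it only matters if the caller needs the exact type or the plain-dict repr.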

If you want to write your own solution, I would try something like this:

  1. Split up the string into a list of words. You can use .split() for this.
  2. Construct a dictionary where each key is one word and the value is 0.
  3. Iterate over your list of words. For every word, add 1 to your_dict[word].
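
The three steps above could be sketched as follows. This is one possible reading of them, not the only one; using dict.fromkeys for step 2 is my choice, and any way of initializing every key to 0 would do:

```python
def strcount(s):
    # Step 1: split the string into a list of words (on any whitespace).
    words = s.split()
    # Step 2: one key per distinct word, every count starting at 0.
    counts = dict.fromkeys(words, 0)
    # Step 3: add 1 to the count for each occurrence.
    for word in words:
        counts[word] += 1
    return counts

print(strcount("a a a a b b"))  # {'a': 4, 'b': 2}
```

Initializing the counts first means step 3 never has to check whether a key exists.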

Alternatively, you can implement your own algorithm without using Counter.

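def countwords(A):
    dic = {}
    for item in A.split():
        # Membership test; valid in both Python 2 and Python 3.
        if item in dic:
            dic[item] += 1
        else:
            dic[item] = 1

    return sorted(dic.items())  # return a sorted list of (word, count) pairs

The test item in dic works in both Python 2 and Python 3. Older Python 2 code often writes the same check as dic.has_key(item), but has_key() was removed in Python 3, so prefer the in form.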

Output:

>>> print(countwords("this one and that one for one time"))
[('and', 1), ('for', 1), ('one', 3), ('that', 1), ('this', 1), ('time', 1)]

@Blender's answer using Counter is great, but it's only available in Python 2.7 and above.

Here is an alternate solution that also works on older versions of Python (defaultdict was added in Python 2.5):

from collections import defaultdict

word_freq = defaultdict(int)
for word in "this one and that one for this one".split():
    word_freq[word] += 1

This will give you:

>>> word_freq
defaultdict(<type 'int'>, {'this': 2, 'and': 1, 'that': 1, 'for': 1, 'one': 3})
>>> word_freq['one']
3
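
The same idea can be packaged as a function. This is only a sketch, and strcount_defaultdict is a hypothetical name, not something from the answer above. It converts the result to a plain dict before returning, because indexing a defaultdict with a word that was never seen silently adds that word with a count of 0:

```python
from collections import defaultdict

def strcount_defaultdict(s):
    word_freq = defaultdict(int)  # missing keys start at int() == 0
    for word in s.split():
        word_freq[word] += 1
    # A plain dict raises KeyError for unseen words instead of
    # inserting a zero entry as a side effect of the lookup.
    return dict(word_freq)

counts = strcount_defaultdict("this one and that one for this one")
print(counts['one'])  # 3
```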

I would do that like this:

def strcount(text):
    d = dict()
    # Split on whitespace; iterating over the string itself would count characters.
    for word in text.split():
        if word not in d:
            d[word] = 1
        else:
            d[word] += 1
    return d

This is the simple approach I use, and it would work for you too. It may not be the fastest, but it works and is easy to follow.
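
If the if/else feels verbose, the same loop can be written with dict.get, which returns a default when the key is missing. A sketch, with strcount_get as a hypothetical name and the input split into words first:

```python
def strcount_get(text):
    counts = {}
    for word in text.split():
        # get() returns 0 for a word not seen yet, so no if/else is needed.
        counts[word] = counts.get(word, 0) + 1
    return counts

print(strcount_get("a a a a b b"))  # {'a': 4, 'b': 2}
```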
