
Count unique words and create dict with word and count in Python

I need help writing a function called strcount(S) that returns a dictionary with words as keys and the number of times each word appears as the corresponding value. The output should look something like this:

>>> strcount("a a a a b b")
{'a': 4, 'b': 2}
>>> strcount("one")
{'one': 1}
>>> sorted(strcount("this one and that one for one time").items())
[('and', 1), ('for', 1), ('one', 3), ('that', 1), ('this', 1), ('time', 1)]

The most Pythonic solution would be to use collections.Counter:

>>> from collections import Counter
>>> Counter("this one and that one for one time".split()).items()
[('and', 1), ('for', 1), ('that', 1), ('this', 1), ('one', 3), ('time', 1)]
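
To get the plain dictionary the question asks for, the Counter can be wrapped in a function. This is only a minimal sketch, assuming Python 3 and that words are separated by whitespace; strcount is the name taken from the question:

```python
from collections import Counter


def strcount(s):
    """Return a dict mapping each whitespace-separated word in s to its count."""
    # Counter is a dict subclass; dict() turns it into a plain dictionary.
    return dict(Counter(s.split()))


print(strcount("a a a a b b"))
print(sorted(strcount("this one and that one for one time").items()))
```

Sorting the items, as the question does, gives a stable order for display regardless of how the dictionary itself is ordered.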

If you want to write your own solution, I would try something like this:

  1. Split the string into a list of words. You can use .split() for this.
  2. Construct a dictionary where each key is one word and the value is 0.
  3. Iterate over your list of words. For every word, add 1 to your_dict[word].
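
The three steps above could be sketched roughly as follows. The use of dict.fromkeys for step 2 is one possible choice, not the only one, and the variable names are illustrative:

```python
def strcount(s):
    # Step 1: split the string into a list of words (on whitespace).
    words = s.split()
    # Step 2: build a dictionary with each distinct word as a key and 0 as its value.
    counts = dict.fromkeys(words, 0)
    # Step 3: walk the list and add 1 for every occurrence of a word.
    for word in words:
        counts[word] += 1
    return counts


print(sorted(strcount("this one and that one for one time").items()))
```

Initialising every key to 0 first means the loop in step 3 never has to check whether a word has been seen before.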

Alternatively, you can implement your own algorithm without using Counter.

def countwords(A):
    dic = {}
    for item in A.split():
        # Membership test with `in` works on both Python 2 and Python 3.
        if item in dic:
            dic[item] += 1
        else:
            dic[item] = 1

    return sorted(dic.items())  # return a sorted list of (word, count) pairs

Note that the membership test is written as item in dic rather than dic.has_key(item): has_key() exists only in Python 2 and was removed in Python 3, while the in operator works in both.

Output:

>>> print(countwords("this one and that one for one time"))
[('and', 1), ('for', 1), ('one', 3), ('that', 1), ('this', 1), ('time', 1)]

@Blender's answer using Counter is great, but it only works on Python 2.7 and above.

Here is an alternative solution that works on older versions of Python:

from collections import defaultdict

word_freq = defaultdict(int)
for word in "this one and that one for this one".split():
    word_freq[word] += 1

This will give you:

>>> word_freq
defaultdict(<type 'int'>, {'this': 2, 'and': 1, 'that': 1, 'for': 1, 'one': 3})
>>> word_freq['one']
3
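
If a plain dictionary is wanted, as in the question, the same idea can be wrapped in a function. This is a sketch rather than part of the original answer; the function name strcount comes from the question, and the final conversion to dict is an added assumption about what the caller wants:

```python
from collections import defaultdict


def strcount(s):
    word_freq = defaultdict(int)  # missing keys start at int() == 0
    for word in s.split():
        word_freq[word] += 1
    # Convert to a plain dict so that looking up an unseen word raises
    # KeyError instead of silently inserting a 0 entry.
    return dict(word_freq)


print(strcount("this one and that one for this one")["one"])
```

Returning a plain dict avoids a quirk of defaultdict: indexing it with a word that never occurred would add that word with a count of 0.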

I would do it like this:

def strcount(text):
    d = dict()
    # Split on whitespace; iterating over the string itself would count characters.
    for word in text.split():
        if word not in d:
            d[word] = 1
        else:
            d[word] += 1
    return d

This is a simple approach that I use, and it should work for you too. It may not be the fastest, but it works and is easy to follow.


 