How do I count unique words using the Counter class in Python?
I'm new to Python and trying out various libraries.
from collections import Counter
print(Counter('like baby baby baby ohhh baby baby like nooo'))
When I print this, the output I receive is:
Counter({'b': 10, ' ': 8, 'a': 5, 'y': 5, 'o': 4, 'h': 3, 'l': 2, 'i': 2, 'k': 2, 'e': 2, 'n': 1})
But I want to find the count of unique words:
#output example
({'like': 2, 'baby': 5, 'ohhh': 1, 'nooo': 1}, ('baby', 5))
How can I do this? Additionally, can I do this without the Counter class, using loops?
Using collections.Counter, you should first split the string into words, like so: words = 'like baby baby ohhh so forth'.split()
Then feed the words variable into the Counter.
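As a rough sketch of those two steps, using the string from the question (the variable names text and counts are just illustrative choices, not anything required by the library):

```python
from collections import Counter

text = 'like baby baby baby ohhh baby baby like nooo'
words = text.split()        # split on whitespace into a list of words
counts = Counter(words)     # now whole words are counted, not characters

print(counts['baby'])       # occurrences of a single word
print(len(counts))          # number of distinct words
```

A Counter behaves like a dict, so len(counts) gives the number of distinct words and counts[word] gives the count for one word.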
Yes, you can do it without the collections module (the Counter object). There are several ways to do it. One of them, probably not the most efficient, is this:
words = 'like baby baby ohhh so forth'.split()
unique_words = set(words)  # converting to a set gets rid of duplicates
wordcount = {}  # an empty dict
for word in unique_words:
    wordcount[word] = 0  # start each word's counter at zero
for word in words:
    # for each occurrence of a word in the list made from the original
    # string, find that key in the dict and increment it by 1
    wordcount[word] += 1
print(wordcount)
Try this:
string = 'like baby baby baby ohhh baby baby like nooo'
words = string.split()
result = dict()
for w in words:
    # get() returns None for a word that has not been seen yet
    if result.get(w) is None:
        result[w] = 1
    else:
        result[w] += 1
for w in result:
    print(w + ' -- ' + str(result[w]))
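If you also want the most frequent word, as in the output example from the question, one possible loop-only variant looks like this. It is only a sketch: top_word and answer are made-up names, and ties are resolved in favour of whichever word appears first.

```python
text = 'like baby baby baby ohhh baby baby like nooo'

result = {}
for w in text.split():
    # get(w, 0) returns 0 when the word has not been seen yet
    result[w] = result.get(w, 0) + 1

# pick the key with the largest count; on a tie, max keeps the first one it meets
top_word = max(result, key=result.get)
answer = (result, (top_word, result[top_word]))
print(answer)
```

Passing a default to get() removes the need for the if/else branch.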
The Python Counter class takes an iterable as its parameter. Since you are giving it a string:
Counter('like baby baby baby ohhh baby baby like nooo')
it iterates over each character of the string and generates a count for each distinct character (including the space). That's why you are receiving
Counter({'b': 10, ' ': 8, 'a': 5, 'y': 5, 'o': 4, 'h': 3, 'l': 2, 'i': 2, 'k': 2, 'e': 2, 'n': 1})
back from the class. One alternative is to pass a list to Counter. This way, Counter iterates over the list elements and produces the counts you expect.
Counter(['like', 'baby', 'baby', 'baby', 'ohhh', 'baby', 'baby', 'like', 'nooo'])
That can also be achieved simply by splitting the string into words using the split method:
Counter('like baby baby baby ohhh baby baby like nooo'.split())
Output:
Counter({'baby': 5, 'like': 2, 'ohhh': 1, 'nooo': 1})
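To get something shaped like the output example in the question (the word counts plus the single most frequent word), Counter.most_common can probably do the rest. A minimal sketch, where top and wanted are just illustrative names:

```python
from collections import Counter

counts = Counter('like baby baby baby ohhh baby baby like nooo'.split())

# most_common(1) returns a one-element list of (word, count) pairs
top = counts.most_common(1)[0]

# dict(counts) gives a plain dict with the same word counts
wanted = (dict(counts), top)
print(wanted)
```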