
For a given list of strings, count the number of words

I have a given list of strings:

strings = ["the method of lagrange multipliers is the economists workhorse for solving optimization problems", "the technique is a centerpiece of economic theory but unfortunately its usually taught poorly"]

Now I want to count how many times each word occurs in each sentence, so that my output will be:

{'the': 2, 'method': 1, 'of': 1, 'lagrange': 1, 'multipliers': 1, 'is': 1, 'economists': 1, 'workhorse': 1, 'for': 1, 'solving': 1, 'optimization': 1, 'problems': 1}

{'the': 1, 'technique': 1, 'is': 1, 'a': 1, 'centerpiece': 1, 'of': 1, 'economic': 1, 'theory': 1, 'but': 1, 'unfortunately': 1, 'its': 1, 'usually': 1, 'taught': 1, 'poorly': 1}

My code is as below:

from collections import Counter

dataset = ["the method of lagrange multipliers is the economists workhorse for solving optimization problems",
           "the technique is a centerpiece of economic theory but unfortunately its usually taught poorly"]


for index,row in enumerate(dataset):
    word_frequency = dict(Counter(row.split(" ")))
    
print(word_frequency)

With this I am getting the following output:

{'the': 1, 'technique': 1, 'is': 1, 'a': 1, 'centerpiece': 1, 'of': 1, 'economic': 1, 'theory': 1, 'but': 1, 'unfortunately': 1, 'its': 1, 'usually': 1, 'taught': 1, 'poorly': 1}

Clearly it's only counting the words of the second sentence, not the first one.

Can anyone help me understand what is wrong in my code?

word_frequency is reassigned for every string in the dataset list, so after the loop it holds only the Counter for the last string. That is why only the counts for the last string are displayed. You can either call print(word_frequency) inside the for loop, or append each word_frequency to a list and print that list once the loop has finished.

from collections import Counter

dataset = ["the method of lagrange multipliers is the economists workhorse for solving optimization problems",
           "the technique is a centerpiece of economic theory but unfortunately its usually taught poorly"]

results = []  # one word-count dict per sentence
for row in dataset:
    word_frequency = dict(Counter(row.split(" ")))
    results.append(word_frequency)

print(results)
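The same idea can be written more compactly with a list comprehension. The following is only a sketch: the helper name count_words_per_sentence is made up for illustration, and it uses split() rather than split(" "), which gives the same result for this dataset.

```python
from collections import Counter


def count_words_per_sentence(sentences):
    """Return one word-count dict per sentence, in input order.

    Hypothetical helper, not part of the original question or answers.
    """
    return [dict(Counter(sentence.split())) for sentence in sentences]


dataset = ["the method of lagrange multipliers is the economists workhorse for solving optimization problems",
           "the technique is a centerpiece of economic theory but unfortunately its usually taught poorly"]

per_sentence = count_words_per_sentence(dataset)
for counts in per_sentence:
    print(counts)  # one dict per sentence, so nothing is overwritten
```

Because every sentence gets its own entry in the returned list, the result of an earlier iteration is never lost.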

Just move the print inside your for loop, since you're overwriting the calculated word_frequency variable on each iteration.

Print inside the loop instead of setting a variable inside the loop (which gets overwritten on each iteration before you print it at the end):

>>> dataset = ["the method of lagrange multipliers is the economists workhorse for solving optimization problems",
...            "the technique is a centerpiece of economic theory but unfortunately its usually taught poorly"]
>>> from collections import Counter
>>> for sentence in dataset:
...     print(dict(Counter(sentence.split())))
...
{'the': 2, 'method': 1, 'of': 1, 'lagrange': 1, 'multipliers': 1, 'is': 1, 'economists': 1, 'workhorse': 1, 'for': 1, 'solving': 1, 'optimization': 1, 'problems': 1}
{'the': 1, 'technique': 1, 'is': 1, 'a': 1, 'centerpiece': 1, 'of': 1, 'economic': 1, 'theory': 1, 'but': 1, 'unfortunately': 1, 'its': 1, 'usually': 1, 'taught': 1, 'poorly': 1}
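Note that this answer calls split() with no argument, while the question uses split(" "). For the dataset above both give the same words, but they differ once a sentence contains repeated or trailing spaces. A small sketch with a made-up input string:

```python
from collections import Counter

# Made-up input with a double space and a trailing space (not from the question).
messy = "the  method of the "

by_single_space = Counter(messy.split(" "))  # splits on every single space
by_whitespace = Counter(messy.split())       # splits on runs of whitespace

print(by_single_space[""])   # 2: empty strings are counted as "words"
print(by_whitespace[""])     # 0: no empty strings are produced
print(by_whitespace["the"])  # 2
```

If the input may contain irregular spacing, split() without an argument is probably the safer choice.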

You're overwriting the word_frequency variable inside the for loop, so only the value from the final iteration is printed.

If you want a single count across all sentences, define a Counter outside the for loop, add each sentence's counts to it inside the loop using Counter's support for addition (+=), then convert it to a dict and print it at the end. Note that this yields one combined dictionary rather than one dictionary per sentence:

from collections import Counter

dataset = ["the method of lagrange multipliers is the economists workhorse for solving optimization problems",
           "the technique is a centerpiece of economic theory but unfortunately its usually taught poorly"]

cumulative_counter = Counter()
for row in dataset:
    cumulative_counter += Counter(row.split(" "))

word_frequency = dict(cumulative_counter)
print(word_frequency)

Output:

{'the': 3, 'method': 1, 'of': 2, 'lagrange': 1, 'multipliers': 1, 'is': 2, 'economists': 1, 'workhorse': 1, 'for': 1, 'solving': 1, 'optimization': 1, 'problems': 1, 'technique': 1, 'a': 1, 'centerpiece': 1, 'economic': 1, 'theory': 1, 'but': 1, 'unfortunately': 1, 'its': 1, 'usually': 1, 'taught': 1, 'poorly': 1}
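As an alternative to +=, Counter also has an update() method that adds counts to the existing ones instead of replacing them. A minimal sketch, with variable names chosen only for illustration:

```python
from collections import Counter

dataset = ["the method of lagrange multipliers is the economists workhorse for solving optimization problems",
           "the technique is a centerpiece of economic theory but unfortunately its usually taught poorly"]

total = Counter()
for sentence in dataset:
    # Counter.update() increments existing counts; it does not overwrite them.
    total.update(sentence.split())

print(total["the"])         # 3
print(sum(total.values()))  # 27 words across both sentences
```

This avoids building a temporary Counter for every sentence, which may matter for larger inputs.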

Notice: technical posts on this site are published under the CC BY-SA 4.0 license. If you repost, please credit this site's URL or the original address.
