
How to count the number of words that start with a string

I'm trying to write code that counts prefixes, suffixes, and roots. All I need to know is how to count the number of words that start or end with a certain string such as 'co'.

This is what I have so far.

SWL=open('mediumWordList.txt').readlines()
for x in SWL:
    x.lower
    if x.startswith('co'):
        a=x.count(x)
while a==True:
    a=+1
    print a

All I get from this is an infinite loop of ones.

First of all, a more Pythonic way to deal with files is to open them with the with statement, which closes the file automatically at the end of the block.

Also, you don't need the readlines method, which loads all the lines into memory; you can simply loop over the file object.

As for counting the words, you need to split each line into words, then use str.startswith and str.endswith to test each word against your conditions.

So you can use a generator expression inside the sum function to count the matching words:

with open('mediumWordList.txt') as f:
    # Add 1 for every word that starts with 'co'
    count = sum(1 for line in f for word in line.split() if word.startswith('co'))
print(count)

Note that we need to split each line to get at the words; if you don't split the line, you'll loop over its individual characters instead.
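To make the difference concrete, here is a minimal sketch using a made-up sample line (the string itself is only an illustration, not taken from the asker's word list). It compares iterating over the line directly with iterating over the result of line.split():

```python
line = "cold coffee and cocoa\n"

# Iterating over the string itself yields single characters.
chars = [ch for ch in line]

# Splitting on whitespace yields whole words (the trailing newline is dropped).
words = line.split()

print(chars[:4])   # ['c', 'o', 'l', 'd']
print(words)       # ['cold', 'coffee', 'and', 'cocoa']
```

Only the second form gives strings for which startswith('co') is a meaningful test.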

As suggested in the comments, a more Pythonic approach is to sum the boolean results of startswith directly:

with open('mediumWordList.txt') as f:
    # True counts as 1 and False as 0 when summed
    count = sum(word.startswith('co') for line in f for word in line.split())
print(count)
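Since the question asks about both prefixes and suffixes, the same idea can be sketched for str.endswith as well. The helper name count_affixes and the sample lines below are hypothetical, and lowercasing each line is an assumption based on the x.lower attempt in the question. The function accepts any iterable of lines, so an open file object should work in place of the list:

```python
def count_affixes(lines, prefix, suffix):
    """Return (words starting with prefix, words ending with suffix)."""
    starts = ends = 0
    for line in lines:
        # lower() is an assumption: it makes the match case-insensitive
        for word in line.lower().split():
            starts += word.startswith(prefix)  # bool adds as 0 or 1
            ends += word.endswith(suffix)
    return starts, ends


# Hypothetical sample data standing in for the lines of a file
sample = ["Cold coffee\n", "disco and cocoa\n", "taco\n"]
print(count_affixes(sample, "co", "co"))  # (3, 2)
```

Here 'cold', 'coffee' and 'cocoa' start with 'co', while 'disco' and 'taco' end with it.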

You could try using the Counter class from collections. For example, to count occurrences of 'Foo' in Bar.txt:

    from collections import Counter
    with open('Bar.txt') as barF:
        # Flatten every line of the file into one list of words
        words = [word for line in barF for word in line.split()]
    c = Counter(words)
    print(c['Foo'])
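Note that a Counter lookup such as c['Foo'] counts exact matches of a whole word, not prefixes. One possible way to adapt it to the prefix question is to add up the counts of every distinct word that starts with the prefix. The function name count_prefix and the sample words are hypothetical:

```python
from collections import Counter


def count_prefix(words, prefix):
    """Total occurrences of all words that start with prefix."""
    counts = Counter(words)
    # Each distinct word is tested once; its count is added if it matches
    return sum(n for word, n in counts.items() if word.startswith(prefix))


# Hypothetical sample data standing in for the words of a file
words = "cold coffee and cold cocoa".split()
print(Counter(words)["cold"])      # 2 (exact-word count)
print(count_prefix(words, "co"))   # 4 (cold twice, coffee, cocoa)
```

This is mainly worthwhile if the Counter will be reused for several different prefixes; for a single count, summing over the words directly is simpler.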
