Find how many words start with a certain letter in a list
I am trying to output the total number of words that start with the letter 'a' in a list read from a separate text file. I'm looking for output such as this:
35 words start with a letter 'a'.
However, with my current code I'm outputting every word that starts with an 'a' instead of the total. Should I be using something other than a for loop?
So far, this is what I have attempted:
wordsFile = open("words.txt", 'r')
words = wordsFile.read()
wordsFile.close()
wordList = words.split()
print("Words:",len(wordList)) # prints number of words in the file.
a_words = 0
for a_words in wordList:
    if a_words[0]=='a':
        print(a_words, "start with the letter 'a'.")
The output I'm getting thus far:
Words: 334
abate start with the letter 'a'.
aberrant start with the letter 'a'.
abeyance start with the letter 'a'.
and so on.
You could replace the loop with a sum call in which you feed 1 for every word in wordList that starts with a:
print(sum(1 for w in wordList if w.startswith('a')), 'start with the letter "a"')
This can be trimmed down further if you sum the boolean values returned by startswith instead. Since True is treated as 1 in these contexts, the effect is the same:
print(sum(w.startswith('a') for w in wordList), 'start with the letter "a"')
With your current approach, you're not summing anything; you're simply printing every word that matches. In addition, you're rebinding a_words from an int to each element of the list as you iterate through it, so the initial a_words = 0 is never used as a counter.
Also, instead of using a_words[0] to check the first character, you could use startswith(character), which has the same effect and is a bit more readable.
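To compare the two sum variants without needing words.txt, here is a minimal sketch that uses a small hard-coded list. The name sample_words and its contents are made up for illustration; in the question the list would come from words.split().

```python
# Hypothetical sample data standing in for the contents of words.txt.
sample_words = ["abate", "aberrant", "banal", "abeyance", "cogent"]

# Variant 1: feed a 1 to sum() for every word that matches.
count_ones = sum(1 for w in sample_words if w.startswith('a'))

# Variant 2: sum the booleans directly; True counts as 1 and False as 0.
count_bools = sum(w.startswith('a') for w in sample_words)

print(count_ones, "words start with the letter 'a'.")
```

Both variants walk the list once and produce the same count; which one to use is mostly a matter of readability.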
You are using a_words as the loop variable that holds each word, so you have no counter. If we change the for loop to use words as the loop variable and reserve a_words for the counter, we can increment the counter each time the condition is met. You could rename a_words to wordCount or something equally generic so the code is easier to reuse for other letters.
a_words = 0
for words in wordList:
    if words[0]=='a':
        a_words += 1
print(a_words, "start with the letter 'a'.")
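The suggestion to make the counter generic could look something like the sketch below. The function name count_words_starting_with, its parameters, and the sample list are assumptions for illustration, not part of the original code.

```python
def count_words_starting_with(word_list, letter):
    """Count the words in word_list that begin with the given letter."""
    word_count = 0
    for word in word_list:
        # startswith also copes with an empty string, where word[0]
        # would raise an IndexError.
        if word.startswith(letter):
            word_count += 1
    return word_count


# Hypothetical sample data standing in for the contents of words.txt.
sample_words = ["abate", "banal", "aberrant", "cogent", "abeyance"]

for letter in "abc":
    total = count_words_starting_with(sample_words, letter)
    print(total, f"words start with the letter '{letter}'.")
```

Keeping the loop variable (word) and the counter (word_count) under separate names avoids the rebinding problem in the original code.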
sum(generator) is the way to go, but for completeness' sake, you may want to do it with a list comprehension (for example, if you find it slightly more readable or you want to do something else with the words starting with a).
words_starting_with_a = [word for word in wordList if word.startswith('a')]
After that you can use the len built-in to get the length of your new list.
print(len(words_starting_with_a), "words start with a letter 'a'")
A simple alternative solution using the re.findall function (without splitting the text or using a for loop):
import re
...
words = wordsFile.read()
...
total = len(re.findall(r'\ba\w*', words))  # \w* so the one-letter word "a" is counted too
print('Total number of words that start with a letter "a" : ', total)
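Note that a regex word boundary and a whitespace split do not always agree on what a "word" is. The sketch below, run on a made-up sample string, assumes the pattern \ba\w* and compares it with the split-based count; the variable names are illustrative only.

```python
import re

# Hypothetical sample text standing in for the contents of words.txt.
text = "abate a non-aligned banal aberrant"

# Whitespace split: a token counts only if its first character is 'a'.
split_count = sum(w.startswith('a') for w in text.split())

# Regex: \b also matches right after the hyphen, so "aligned" inside
# "non-aligned" is counted as a separate word.
regex_matches = re.findall(r'\ba\w*', text)
regex_count = len(regex_matches)

print("split:", split_count, "regex:", regex_count)
```

If the file contains hyphenated words or punctuation, the two approaches can therefore report different totals, so pick whichever definition of a word fits your data.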