How to count the names that start with each letter in a txt file using Python
I need to know how to count the words in the list that start with each letter A, B, C, ..., Z.
Here is the part that reads the txt file:
#!/usr/bin/python
def main():
    lines = []
    xs = []
    try:
        with open("bin-nombres.txt", 'r') as fp:
            lines = [lines.strip() for lines in fp]
            for i in lines:
                print(i[0])
                xs = counterByLetter(i[0])
            print(xs)
    except EOFError as e:
        print(e)
    finally:
        pass

def counterByLetter(data):
    return [(k, v) for k, v in {v: data.count(v) for v in 'abcdefghijklmnopqrstuvwxyz'}.items()]

if __name__ == "__main__":
    main()
I must count the number of words that begin with each letter [A ... Z].
Here is the solution to the problem. Thanks to those who helped me!
import string

def main():
    try:
        # Initialise the counter with 0 for each letter.
        letter_count = {letter: 0 for letter in string.ascii_lowercase}
        with open("bin-nombres.txt", 'r') as fp:
            for line in fp:
                line = line.strip()
                if not line:
                    continue  # skip blank lines, which have no first letter
                initial = line[0].lower()
                if initial in letter_count:
                    letter_count[initial] += 1  # increment once per word
        # Iterate over the dictionary to get each letter and its count,
        # adding up the counts to get the total number of words.
        size = 0
        for key, value in letter_count.items():
            size += value
            print("There are {} names that start with the letter '{}'.".format(value, key))
        print("Total names in the file: {}".format(size))
    except OSError as e:
        # open() raises OSError (e.g. file not found); EOFError is never raised here
        print(e)

if __name__ == "__main__":
    main()
Assume there is a list named list which has 3 elements:
list = ["Geeks", "For", "Triks"]
And there is an array which has 26 elements, all starting at zero:
array = [0, 0, ......0, 0......0, 0]
array[0] represents the number of words that start with A.
..................
array[25] represents the number of words that start with Z.
Then, if list[n][0] is A, you need to increment array[0] by 1.
If array[5] = 7, it means that there are 7 words that start with F.
This is the straightforward logic for finding the result.
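The logic described above could be sketched as follows. This is only an illustration: the function name count_by_initial is made up here, the list is called words rather than list to avoid shadowing the built-in, and only ASCII letters are counted.

```python
import string

def count_by_initial(words):
    """Return a 26-element list; index 0 counts A, index 25 counts Z."""
    counts = [0] * 26
    for word in words:
        # Ignore empty strings and words that do not start with an ASCII letter.
        if word and word[0] in string.ascii_letters:
            index = ord(word[0].upper()) - ord("A")  # 'A' -> 0, ..., 'Z' -> 25
            counts[index] += 1
    return counts

words = ["Geeks", "For", "Triks"]
counts = count_by_initial(words)
print(counts[5])   # words starting with F
print(counts[6])   # words starting with G
print(counts[19])  # words starting with T
```

Mapping each letter to an index with ord keeps the array at a fixed 26 slots, which matches the array[0] ... array[25] layout described above.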
So, according to your updated question (one word per line, already sorted alphabetically), something like this should work:
import string

def main():
    try:
        # This initialises our counter with 0 for each letter.
        letter_count = {letter: 0 for letter in string.ascii_lowercase}
        with open("words.txt", 'r') as fp:
            for line in fp:
                line = line.strip()
                if not line:
                    continue  # skip blank lines, which have no first letter
                initial = line[0].lower()
                if initial in letter_count:
                    letter_count[initial] += 1  # and here we increment per word
        print(letter_count)
    except OSError as e:
        # open() raises OSError (e.g. file not found); EOFError is never raised here
        print(e)

if __name__ == "__main__":
    main()
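A shorter variant of the same idea, not part of the answer above, could use collections.Counter from the standard library. The function name count_initials_with_counter and the sample names are made up for illustration, and the sketch takes a list of strings instead of reading words.txt so that it runs on its own.

```python
import string
from collections import Counter

def count_initials_with_counter(words):
    """Count lowercase ASCII initials; letters with no words map to 0."""
    # Take the first character of every non-blank entry, lowercased.
    initials = (w.strip()[0].lower() for w in words if w.strip())
    found = Counter(initials)
    # A Counter returns 0 for missing keys, so every letter gets an entry.
    return {letter: found[letter] for letter in string.ascii_lowercase}

sample = ["Ana", "alberto", "Zoe", ""]  # hypothetical sample data
letter_count = count_initials_with_counter(sample)
print(letter_count["a"], letter_count["z"], letter_count["b"])
```

With a real file, passing the open file object in place of sample should behave the same way, since iterating a file yields one line at a time.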
UPDATE:
It's good that you don't just want a ready-made solution, but your code has a few issues and some parts are not very Pythonic, which is why I suggested doing it as above. If you really want to go with your solution, you need to fix your counterByLetter function. The problem is that you're not actually storing the results anywhere: you always return a new list of results for each word. You probably have a word starting with 'z' as the last word of the file, hence the result has 0 as the count for all letters except 'z', which has 1. You need to update the value for the current letter in that function, instead of calculating the whole array at once.
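One way the fix described above might look is sketched below. The two-argument signature (a shared counts dictionary plus the current word) is an assumption made for this illustration, not the asker's original signature, and the sample names are invented.

```python
import string

def counterByLetter(counts, word):
    """Increment the entry in `counts` for the first letter of `word`."""
    if word:
        initial = word[0].lower()
        if initial in counts:  # ignore initials that are not a-z
            counts[initial] += 1
    return counts

# The results live in one dictionary that persists across all words.
counts = {letter: 0 for letter in string.ascii_lowercase}
for name in ["Ana", "alberto", "Zoe"]:  # hypothetical sample data
    counterByLetter(counts, name)
print(counts["a"], counts["z"])
```

Because the same dictionary is passed in on every call, earlier counts are no longer overwritten by the last word in the file.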
I'd suggest changing your code a bit, like this.
Use collections.defaultdict with int as the default factory: using the first letter as the dictionary key, you can increment its value each time there is a match. So:
from collections import defaultdict
Set xs as xs = defaultdict(int)
Change the body of the for i in lines: loop to:
for i in lines:
    xs[i[0]] += 1
If you print xs at the end of the for loop, you'll get something like:
defaultdict(<class 'int'>, {'P': 3, 'G': 2, 'R': 2})
Keys in a dict are case sensitive, so take care to normalize the case if required.
You don't need an external function to do the counting.
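Putting these pieces together, a minimal sketch could look like the following. The helper name count_initials and the sample names are invented for illustration; upper-casing the first letter is one possible way to handle the case sensitivity mentioned above.

```python
from collections import defaultdict

def count_initials(lines):
    """Count entries by their upper-cased first character."""
    xs = defaultdict(int)  # missing keys start at 0
    for i in lines:
        i = i.strip()
        if i:  # skip blank lines
            xs[i[0].upper()] += 1
    return xs

names = ["Pedro", "pablo", "Gina", "Rosa"]  # hypothetical sample data
xs = count_initials(names)
for letter in sorted(xs):
    print(letter, xs[letter])
```

Unlike a dictionary pre-filled with all 26 letters, a defaultdict only contains the letters that actually occur, so letters with no names simply do not appear in the output.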