Counting the total number of words in a text file

I am new to Python and am trying to print the total number of words in a text file, as well as the total number of occurrences of specific words supplied by the user.

I tested my code, but it prints a count for every individual word. I need only the overall word count for the file and the overall count of the words provided by the user.

Code:

name = raw_input("Enter the query x ")
name1 = raw_input("Enter the query y ")
file=open("xmlfil.xml","r+")
wordcount={}
for word in file.read().split():
    if word not in wordcount:
        wordcount[word] = 1
    else:
        wordcount[word] += 1
for k,v in wordcount.items():
    print k, v

for name in file.read().split():
    if name not in wordcount:
        wordcount[name] = 1
    else:
        wordcount[name] += 1
for k,v in wordcount.items():
    print k, v

for name1 in file.read().split():
    if name1 not in wordcount:
        wordcount[name1] = 1
    else:
        wordcount[name1] += 1
for k,v in wordcount.items():
    print k, v

# Count every word in test.txt and tally only the words listed in given_words
given_words = ['The', 'document', '1']
words = {}
count = 0
with open('test.txt', 'r') as my_file:
    for x in my_file.read().split():
        count += 1
        if x in given_words:
            words[x] = words.get(x, 0) + 1
print count, words

Sample output

17 {'1': 1, 'The': 1, 'document': 1}

Please don't name the variable that holds the result of open() "file": doing so shadows Python 2's built-in file type.
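
A related point about the file handle in the question's code: it calls read() three times on the same open file. The first call consumes the whole file, so the later calls return an empty string and the second and third loops never run. A minimal sketch of this behavior, written for Python 3 and using an in-memory io.StringIO as a stand-in for the real file (the sample text is made up for illustration):

```python
import io

# In-memory stand-in for an open text file (hypothetical sample content).
handle = io.StringIO("the quick brown fox jumps over the lazy dog")

first = handle.read().split()    # consumes everything up to end of file
second = handle.read().split()   # position is already at the end

print(len(first))    # 9
print(len(second))   # 0

# seek(0) rewinds the handle, but reading once and reusing the list is simpler.
handle.seek(0)
third = handle.read().split()
print(third == first)   # True
```

Reading the file once and keeping the resulting list of words avoids the problem entirely.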

You can get what you need easily with collections.Counter:

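from collections import Counter

c = Counter()
# Text mode keeps the words as ordinary strings
with open('your_file', 'r') as f:
    for ln in f:
        c.update(ln.split())  # add this line's words to the tally

total = sum(c.values())             # number of words in the whole file
specific = c['your_specific_word']  # occurrences of one word (0 if absent)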
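
To tie this back to the question, the Counter approach can be wrapped in a small function that returns the overall word count together with the counts for the words entered by the user. This is only a sketch under some assumptions: it targets Python 3, it splits on whitespace only (so punctuation and XML tags stay attached to words), and the names count_words, path and queries are made up for illustration.

```python
from collections import Counter


def count_words(path, queries):
    """Return (total word count, {query: count}) for the file at path."""
    counts = Counter()
    # Text mode, so the keys are str and can be compared with user input.
    with open(path, "r") as f:
        for line in f:
            counts.update(line.split())
    total = sum(counts.values())
    # A Counter returns 0 for missing keys, so absent queries are safe.
    specific = {q: counts[q] for q in queries}
    return total, specific


if __name__ == "__main__":
    import os
    import tempfile

    # Hypothetical sample file, created only for this demonstration.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        tmp.write("The document has 1 title\nThe title is short\n")
    try:
        total, specific = count_words(tmp.name, ["The", "title", "missing"])
        print(total)      # 9
        print(specific)   # {'The': 2, 'title': 2, 'missing': 0}
    finally:
        os.remove(tmp.name)
```

In the question's setting, queries would be the two strings read from the user, and only total and specific would be printed instead of every entry in the counter.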
