Counting paragraphs and the most frequent words in a text file with Python

Question

I am trying to count the number of paragraphs and the most frequent words in a text file (any text file, for that matter), but I get no output when I run my code, and no errors either. Any tips on where I'm going wrong?

filename = input("enter file name: ")
inf = open(filename, 'r')
#frequent words 
wordcount={}
for word in inf.read().split():
    if word not in wordcount:
        wordcount[word] = 1
    else:
        wordcount[word] += 1
for key in wordcount.keys():
    print ("%s %s " %(key , wordcount[key]))

#Count Paragraph(s)
linecount = 0
for i in inf:
   paragraphcount = 0
   if '\n' in i:
      linecount += 1
   if len(i) < 2: paragraphcount *= 0
   elif len(i) > 2: paragraphcount = paragraphcount + 1
   print('%-4d %4d %s' % (paragraphcount, linecount, i))  
inf.close()

Answer 1

filename = input("enter file name: ")

wordcount = {}
paragraphcount = 0
in_paragraph = False  # True while we are inside a run of non-blank lines
with open(filename, 'r') as ftext:
    for line in ftext:
        if line.strip() == '':
            # a blank line ends the current paragraph
            in_paragraph = False
        else:
            if not in_paragraph:
                # first non-blank line after a gap starts a new paragraph
                paragraphcount = paragraphcount + 1
                in_paragraph = True
            # frequent words
            for word in line.split():
                wordcount[word] = wordcount.get(word, 0) + 1

print(wordcount)
print(paragraphcount)

Answer 2

When you read a file, a cursor tracks which byte you are currently reading. Your code tries to read the file twice and runs into strange behavior, which should have been a hint that something is wrong. On to the solution.

What is the correct way?

You should read the file once, store every line, and then compute both the word count and the paragraph count from that same store, rather than trying to read the file twice.
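A minimal sketch of this read-once approach, assuming paragraphs are separated by blank lines and words are whatever `str.split()` yields (punctuation is not stripped). The helper name `analyze_lines`, its `top_n` parameter, and the use of `collections.Counter` are illustrative choices, not something the original answers prescribe.

```python
from collections import Counter


def analyze_lines(lines, top_n=3):
    """Return (paragraph_count, most_common_words) for already-read lines.

    Hypothetical helper: assumes blank lines separate paragraphs.
    """
    word_counts = Counter()
    paragraph_count = 0
    in_paragraph = False
    for line in lines:
        if line.strip():
            # A non-blank line after a gap starts a new paragraph.
            if not in_paragraph:
                paragraph_count += 1
                in_paragraph = True
            word_counts.update(line.split())
        else:
            # A blank line ends the current paragraph.
            in_paragraph = False
    return paragraph_count, word_counts.most_common(top_n)


# The file would be read once, for example:
#     with open(filename) as f:
#         lines = f.readlines()
# Here an in-memory sample stands in for the file contents.
sample_lines = ["the cat sat\n", "on the mat\n", "\n", "the end\n"]
paragraphs, top_words = analyze_lines(sample_lines, top_n=1)
print(paragraphs, top_words)  # 2 [('the', 3)]
```

Because both results come from the same list of lines, the position of the file cursor never matters after the initial read.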

What is happening in the current code?

The first read leaves the byte cursor at the end of the file. When you then try to read lines, you get nothing back, because the read starts at the end of the file. You can correct this by resetting the file pointer (the cursor).

Call inf.seek(0) just before you try to read the lines. That said, the better fix is to implement the approach described in the first section.
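The cursor behavior can be seen in a small self-contained sketch. The temporary file and the variable names `lines_before` and `lines_after` are only there for illustration; in the question's code the file would be whatever the user entered.

```python
import os
import tempfile

# Create a small throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first line\nsecond line\n")
    path = tmp.name

with open(path, "r") as inf:
    words = inf.read().split()  # the cursor is now at the end of the file
    lines_before = list(inf)    # nothing is left to read
    inf.seek(0)                 # move the cursor back to the start
    lines_after = list(inf)     # the lines are readable again

os.remove(path)
print(len(words), lines_before, len(lines_after))  # 4 [] 2
```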
