Python - Counting words in a text file

Question

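I'm new to Python and am working on a program that counts the occurrences of each word in a simple text file. The program and the text file are given on the command line, so my code reads the command-line arguments. The code is below.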

import sys

count={}

with open(sys.argv[1],'r') as f:
    for line in f:
        for word in line.split():
            if word not in count:
                count[word] = 1
            else:
                count[word] += 1

print(word,count[word])

file.close()

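count is a dictionary that stores each word and the number of times it occurs. I want to print each word with its number of occurrences, ordered from the most frequent to the least frequent.

I'd like to know whether I'm on the right track, and whether I'm using sys correctly. Thank you!!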

Answer 1

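What you did looks fine to me. You could also use collections.Counter (assuming you are on Python 2.7 or newer) to get more information, such as the count of each word. My solution would look like this; it could probably be improved.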

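import sys
from collections import Counter

c = Counter()
# the with block closes the file when the loop finishes
with open(sys.argv[1], 'r') as f:
    for line in f:
        for word in line.split():
            # count the whole word; c.update(word) would count its characters
            c[word] += 1
for word in c:
    print('%s %d' % (word, c[word]))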
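
Since the question asks for the most frequent words first, Counter.most_common() may be a convenient fit. The following is a minimal sketch rather than part of the original answer: count_words and the sample list are hypothetical names, and the list stands in for the lines of a real file.

```python
from collections import Counter


def count_words(lines):
    """Return a Counter mapping each whitespace-separated word to its count."""
    counts = Counter()
    for line in lines:
        # update() with a list adds one for each list element, i.e. each word
        counts.update(line.split())
    return counts


# Hypothetical sample input standing in for the lines of a text file
sample = ["the cat sat", "the cat ran", "the end"]
counts = count_words(sample)

# most_common() yields (word, count) pairs, highest count first
for word, n in counts.most_common():
    print(word, n)
```

An open file object can be passed in place of the sample list, because iterating over a file yields its lines.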

Answer 2

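Your final print is not inside a loop, so it only prints the count for the last word you read, which is the value word still holds.

Also, with the with context manager, you don't need to close() the file handle.

Finally, as pointed out in the comments, you'll want to remove the final newline from each line before you split. (With a bare split(), which splits on any run of whitespace, the trailing newline is already discarded; stripping only matters if you split on an explicit separator.)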
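
Put together, these points might look like the sketch below. It is an illustration, not the asker's final program: io.StringIO and its sample text are assumptions standing in for open(sys.argv[1], 'r'), so the example runs without a file on disk.

```python
import io

count = {}

# io.StringIO stands in for open(sys.argv[1], 'r') in this sketch
with io.StringIO("a b a\nb a\n") as f:
    for line in f:
        # split() with no arguments also discards the trailing newline
        for word in line.split():
            if word not in count:
                count[word] = 1
            else:
                count[word] += 1
# the with block has already closed f here; no explicit close() is needed

# print inside a loop so every word is reported, not just the last one read
for word in count:
    print(word, count[word])
```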

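For a simple program like this it's probably not worth the trouble, but you might want to look at defaultdict in the collections module to avoid the special case of initializing a new key in the dictionary.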
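
A minimal sketch of the defaultdict idea, assuming a hypothetical helper count_words that takes any iterable of lines:

```python
from collections import defaultdict


def count_words(lines):
    """Count whitespace-separated words without a first-occurrence branch."""
    count = defaultdict(int)  # a missing key starts at int(), which is 0
    for line in lines:
        for word in line.split():
            count[word] += 1
    return count


# Hypothetical sample input
count = count_words(["to be or not to be"])
for word in count:
    print(word, count[word])
```

One caveat: merely reading count[word] for an unseen word inserts it with a value of 0, so use `word in count` or count.get(word) when you only want to check.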

Answer 3

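I just noticed a typo: you open the file as f but close it as file. As tripleee said, you shouldn't close files that you open in a with statement. Also, it's bad practice to use the names of built-in functions, like file or list, for your own identifiers. Sometimes it works, but sometimes it causes nasty bugs. It is also confusing for people who read your code; a syntax-highlighting editor can help avoid this little problem.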

To print the data in your count dictionary in descending order of count, you can do something like this:

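# copy the (word, count) pairs into a list so they can be sorted in place
items = list(count.items())
# sort by the count (second element of each pair), largest first
items.sort(key=lambda kv: kv[1], reverse=True)
print('\n'.join('%s: %d' % (k, v) for k, v in items))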

See the Python Library Reference for more details on the list.sort() method and other handy dict methods.
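
Words with equal counts come out in whatever order the dictionary yields them. If a fully deterministic listing is wanted, one option (an addition here, not something the answer prescribes) is a compound sort key. The helper name sorted_counts and the sample dictionary are hypothetical.

```python
def sorted_counts(count):
    """Return (word, count) pairs, highest count first, ties in alphabetical order."""
    # negating the count sorts it descending while the word sorts ascending
    return sorted(count.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical sample data
count = {"c": 2, "a": 3, "b": 2}
items = sorted_counts(count)
print('\n'.join('%s: %d' % (k, v) for k, v in items))
```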

Answer 4

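I just did this using the re library. This computes the average number of words per line in a text file; you would still have to work out the number of words on each individual line yourself.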

import re

# this program gets the average number of words per line
def main():
    # get the name of the file
    filename = input('Enter a filename:')
    try:
        # open the file; the with block closes it even if reading fails
        with open(filename, 'r') as infile:
            # read the file contents
            contents = infile.read()
    except IOError:
        print('An error occurred when trying to read')
        print('the file', filename)
        return

    # count lines by their newline characters, and words by \w+ runs
    lines = len(re.findall(r'\n', contents))
    count = len(re.findall(r'\w+', contents))
    # a file with no newline is treated as one line to avoid dividing by zero
    average = count // lines if lines else count

    # display the file contents
    print(contents)
    print('there is an average of', average, 'words per line')

# call main
main()
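
For the per-line counts that this answer leaves open, one possible sketch applies the same \w+ pattern to each line separately. The function names and the sample text are hypothetical, and the average here is a float rather than the floor division used above.

```python
import re


def words_per_line(text):
    """Return a list holding the number of word matches on each line of text."""
    return [len(re.findall(r"\w+", line)) for line in text.splitlines()]


def average_words_per_line(text):
    """Return the mean number of words per line, or 0.0 for empty text."""
    counts = words_per_line(text)
    if not counts:
        return 0.0
    return sum(counts) / float(len(counts))


# Hypothetical sample text; the third line is empty
sample = "one two three\nfour five\n\nsix\n"
print(words_per_line(sample))
print(average_words_per_line(sample))
```

Empty lines count as lines with zero words in this sketch; filter them out first if they should not lower the average.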

4 answers

Answer 1
3, accepted, 2014-09-11 03:17:29

Answer 2
0, 2014-09-11 03:40:31

Answer 3
0, 2014-09-11 04:02:12

Answer 4
0, 2017-11-13 01:10:11
