I have a txt file. How do I take the dictionary keys and print the lines of text they appear in?

Question

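I have a txt file. I wrote code that finds the unique words and the number of times each word appears in that file. I now need to figure out how to print the lines those words appear on. How can I do this?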

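Here is a sample output:

Analyze what file: itsy_bitsy_spider.txt
Concordance for file itsy_bitsy_spider.txt
itsy : Total Count: 2
Line: 1: ITSY Bitsy spider climbed up the water spout
Line: 4: ITSY Bitsy spider climbed up the water spout again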

#this function will get just the unique words without the stop words. 
def openFiles(openFile):

    for i in openFile:
        i = i.strip()
        linelist.append(i)
        b = i.lower()
        thislist = b.split()
        for a in thislist:
            if a in stopwords:
                continue
            else:
                wordlist.append(a)
    #print wordlist




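from string import punctuation  # assumed source of `punctuation`

#this dictionary is used to count the number of times each stop 
countdict = {}
def countWords(this_list):
    for word in this_list:
        depunct = word.strip(punctuation)
        # the count update sits inside the loop so every word is tallied
        if depunct in countdict:
            countdict[depunct] += 1
        else:
            countdict[depunct] = 1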

Answer 1

from collections import defaultdict

target = 'itsy'
word_summary = defaultdict(list)
with open('itsy.txt', 'r') as f:
    lines = f.readlines()

for idx, line in enumerate(lines):
    words = [w.strip().lower() for w in line.split()]
    for word in words:
        word_summary[word].append(idx)

unique_words = len(word_summary)
target_occurrence = len(word_summary[target])
line_nums = sorted(set(word_summary[target]))  # sorted so lines print in order

# parenthesised print works on both Python 2 and Python 3
print("There are %s unique words." % unique_words)
print("There are %s occurrences of '%s'" % (target_occurrence, target))
print("'%s' is found on lines %s" % (target, ', '.join([str(i+1) for i in line_nums])))
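
The snippet above reports line numbers only. To print the matching lines themselves, as in the sample output from the question, the same kind of index can be used to look the text back up. This is a minimal sketch, not the answerer's code: `build_index` and `format_concordance` are hypothetical helper names, punctuation stripping is an added assumption, and the sample text is made up for illustration.

```python
from collections import defaultdict
from string import punctuation


def build_index(lines):
    """Map each lower-cased, punctuation-stripped word to the line indexes it occurs on."""
    index = defaultdict(list)
    for idx, line in enumerate(lines):
        for raw in line.split():
            word = raw.strip(punctuation).lower()
            if word:
                index[word].append(idx)
    return index


def format_concordance(target, lines, index):
    """Return report rows for one word: total count, then each matching line once."""
    hits = index.get(target, [])
    report = ["%s : Total Count: %d" % (target, len(hits))]
    for idx in sorted(set(hits)):
        report.append("Line: %d: %s" % (idx + 1, lines[idx].strip()))
    return report


# Illustrative sample text (an assumption, not the asker's actual file).
sample = [
    "ITSY Bitsy spider climbed up the water spout\n",
    "Down came the rain\n",
    "Out came the sun\n",
    "ITSY Bitsy spider climbed up the water spout again\n",
]
index = build_index(sample)
for row in format_concordance("itsy", sample, index):
    print(row)
```

With a real file, `sample` would be replaced by the result of `f.readlines()`.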

Answer 2

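If you parse the input text file line by line, you can maintain a second dictionary that is a word -> List<Line> mapping, i.e. add an entry for each word in a line. It might look something like the following. Bear in mind that I'm not very familiar with Python, so I may be missing some syntactic shortcuts.

For example: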

from string import punctuation  # assumed source of `punctuation`

countdict = {}
linedict = {}
for line in text_file:
    # split into words; iterating the string directly would yield characters
    for word in line.split():
        depunct = word.strip(punctuation)
        if depunct in countdict:
            countdict[depunct] += 1
        else:
            countdict[depunct] = 1

        # add entry for word in the line dict if not there already
        if depunct not in linedict:
            linedict[depunct] = []

        # now add the word -> line entry
        linedict[depunct].append(line)

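One modification you may want to make is to prevent duplicates from being added to the line dictionary when a word appears twice in the same line.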

The code above assumes you only want to read the text file once.
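
The duplicate-prevention tweak mentioned above could look roughly like this. It is a sketch under a few assumptions: words are split on whitespace and lower-cased, line numbers are stored instead of the line text, and `index_lines` is a hypothetical name.

```python
from string import punctuation


def index_lines(text_file):
    """Return (countdict, linedict) for an iterable of lines.

    countdict counts every occurrence of a word; linedict lists each
    line number at most once per word.
    """
    countdict = {}
    linedict = {}
    for lineno, line in enumerate(text_file, 1):
        for word in line.split():
            depunct = word.strip(punctuation).lower()
            if not depunct:
                continue  # token was punctuation only
            countdict[depunct] = countdict.get(depunct, 0) + 1
            seen = linedict.setdefault(depunct, [])
            # Lines are visited in order, so a repeat within the same
            # line can only match the most recently appended number.
            if not seen or seen[-1] != lineno:
                seen.append(lineno)
    return countdict, linedict


counts, lines_for = index_lines(["the rain, the RAIN\n", "no more rain\n"])
```

Because any iterable of lines is accepted, an open file object can be passed in directly, which keeps the single pass over the file.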

Answer 3

openFile = open("test.txt", "r")

words = {}

for line in openFile.readlines():
  for word in line.strip().lower().split():
    wordDict = words.setdefault(word, { 'count': 0, 'line': set() })
    wordDict['count'] += 1
    wordDict['line'].add(line)

openFile.close()

print(words)  # parenthesised form works on both Python 2 and Python 3
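
The nested dictionary above prints as one long literal. For a per-word report it can be walked in sorted order. This is a sketch assuming the same `{'count': ..., 'line': set()}` layout; `summarize` is a hypothetical helper. Because the lines are stored in a set, their original order is not preserved, so they are sorted alphabetically here.

```python
def summarize(words):
    """Return report rows: one header per word, then its lines, alphabetically."""
    report = []
    for word in sorted(words):
        entry = words[word]
        report.append("%s : Total Count: %d" % (word, entry['count']))
        for line in sorted(entry['line']):
            report.append("    " + line.strip())
    return report


# Build the structure the same way as above, from an in-memory sample.
words = {}
for line in ["up the spout\n", "down the rain\n"]:
    for word in line.strip().lower().split():
        entry = words.setdefault(word, {'count': 0, 'line': set()})
        entry['count'] += 1
        entry['line'].add(line)

for row in summarize(words):
    print(row)
```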

3 solutions

Solution 1
1 2011-11-01 04:01:16

Solution 2
0 2011-11-01 02:48:55

Solution 3
0 2011-11-01 03:32:10
