Output is empty after appending to a list

Question

r = ","
x = ""
output = list()
import string

def find_word(filepath,keyword):
    doc = open(filepath, 'r')

    for line in doc:
        #Remove all the unnecessary characters
        line = line.replace("'", r)
        line = line.replace('`', r)
        line = line.replace('[', r)
        line = line.replace(']', r)
        line = line.replace('{', r)
        line = line.replace('}', r)
        line = line.replace('(', r)
        line = line.replace(')', r)
        line = line.replace(':', r)
        line = line.replace('.', r)
        line = line.replace('!', r)
        line = line.replace('?', r)
        line = line.replace('"', r)
        line = line.replace(';', r)
        line = line.replace(' ', r)
        line = line.replace(',,', r)
        line = line.replace(',,,', r)
        line = line.replace(',,,,', r)
        line = line.replace(',,,,,', r)
        line = line.replace(',,,,,,', r)
        line = line.replace(',,,,,,,', r)
        line = line.replace('#', r)
        line = line.replace('*', r)
        line = line.replace('**', r)
        line = line.replace('***', r)

        #Make the line lowercase
        line = line.lower()

        #Split the line after every r (comma) and name the result "word"
        words = line.split(r)

        #if the keyword (also in lowercase form) appears in the before created words list
        #then append the list output by the whole line in which the keyword appears

        if keyword.lower() in words:
            output.append(line)

    return output

print find_word("pg844.txt","and")

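The goal of this code is to search a text file for a keyword, for example "and", and then put every whole line in which the keyword is found into a list of (int, string) tuples. The int should be the line number and the string should be the whole line mentioned above.

I'm still working on the line numbers, so that's not the question yet. The problem is that the output is empty. Even if I append a random string instead of the line, I don't get any results.

If I use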

if keyword.lower() in words:
        print line

I get all the desired lines in which the keyword appears. But I can't get them into the output list.

The text file I'm trying to search: http://www.gutenberg.org/cache/epub/844/pg844.txt

Answer 1

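Please use regular expressions. See the regex documentation for Python. Replacing every character or set of characters one by one is confusing. The list and .append() look correct, but consider debugging your line variable inside the for loop, printing it occasionally to make sure its value is what you expect.

pyInProgress's answer makes a good point about global variables. Although I haven't tested it, I don't believe the global keyword is needed if you return a local output variable instead of using the global output variable. If you need more information on global variables, see this StackOverflow post.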
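
The regex suggestion above can be sketched roughly as follows. This is only an illustration, not the answerer's code: it assumes Python 3 (the code in this thread uses Python 2 print statements), assumes the file is UTF-8 encoded, and the name find_word_regex is hypothetical. It keeps output local to the function and returns (line number, line) tuples as the question asks.

```python
import re

def find_word_regex(filepath, keyword):
    """Return (line_number, line) tuples for lines containing keyword as a whole word."""
    output = []  # local list, so no global state is involved
    # \b marks word boundaries; re.escape guards against special characters in keyword
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    # Assumption: the file is UTF-8 text
    with open(filepath, "r", encoding="utf-8") as doc:
        for number, line in enumerate(doc, start=1):  # 1-based line numbers
            if pattern.search(line):
                output.append((number, line.rstrip("\n")))
    return output
```

Called as find_word_regex("pg844.txt", "and"), this should match "and" but not "sandy", because the word boundaries require the keyword to stand alone.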

Answer 2

Loop over string.punctuation to remove all punctuation before iterating over the lines:

import string, re

r = ','

def find_word(filepath, keyword):

    output = []
    with open(filepath, 'rb') as f:
        data = f.read()
        for x in list(string.punctuation):
            if x != r:
                data = data.replace(x, '')
        # Collapse runs of two or more commas into a single comma
        data = re.sub(r',{2,}', r, data).splitlines()

    for i, line in enumerate(data):
        if keyword.lower() in line.lower().split(r):
            output.append((i, line))
    return output

print find_word('pg844.txt', 'and')
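
For readers on Python 3, the same punctuation-stripping idea might look like the sketch below. It is a variant under stated assumptions, not the answer's own code: it opens the file in text mode with an assumed UTF-8 encoding, maps punctuation to spaces with str.translate instead of calling replace in a loop, and uses the hypothetical name find_word_translate. Line indices are zero-based, as with enumerate in the code above.

```python
import string

def find_word_translate(filepath, keyword):
    """Return (line_index, line) tuples for lines containing keyword as a word."""
    # Map every punctuation character to a space so split() yields bare words
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    keyword = keyword.lower()
    output = []
    # Assumption: the file is UTF-8 text
    with open(filepath, "r", encoding="utf-8") as f:
        for i, line in enumerate(f):  # zero-based index
            words = line.lower().translate(table).split()
            if keyword in words:
                output.append((i, line.rstrip("\n")))
    return output
```

Splitting on whitespace after translating avoids the need to collapse repeated separators, since split() with no argument ignores runs of spaces.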

Answer 3

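Because output = list() sits at the top level of the code and not inside the function, it is treated as a global variable. To assign to a global variable inside a function, you must first declare it with the global keyword.

Example: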

gVar = 10

def editVar():
    global gVar
    gVar += 5

So, to edit the variable output inside the function find_word(), you must write global output before assigning a value to it.

It should look like this:

r = ","
x = ""
output = list()
import string

def find_word(filepath,keyword):
    doc = open(filepath, 'r')

    for line in doc:
        #Remove all the unnecessary characters
        line = line.replace("'", r)
        line = line.replace('`', r)
        line = line.replace('[', r)
        line = line.replace(']', r)
        line = line.replace('{', r)
        line = line.replace('}', r)
        line = line.replace('(', r)
        line = line.replace(')', r)
        line = line.replace(':', r)
        line = line.replace('.', r)
        line = line.replace('!', r)
        line = line.replace('?', r)
        line = line.replace('"', r)
        line = line.replace(';', r)
        line = line.replace(' ', r)
        line = line.replace(',,', r)
        line = line.replace(',,,', r)
        line = line.replace(',,,,', r)
        line = line.replace(',,,,,', r)
        line = line.replace(',,,,,,', r)
        line = line.replace(',,,,,,,', r)
        line = line.replace('#', r)
        line = line.replace('*', r)
        line = line.replace('**', r)
        line = line.replace('***', r)

        #Make the line lowercase
        line = line.lower()

        #Split the line after every r (comma) and name the result "word"
        words = line.split(r)

        #if the keyword (also in lowercase form) appears in the before created words list
        #then append the list output by the whole line in which the keyword appears

        global output
        if keyword.lower() in words:
            output.append(line)

    return output

In the future, try to stay away from global variables unless you absolutely need them. They get messy!
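
To illustrate when the global keyword matters, here is a small sketch in Python 3 (all names are hypothetical and not taken from the thread). It contrasts mutating a module-level list, which works without a declaration, with assigning to a module-level name, which needs one. This is consistent with the doubt raised in Answer 1 that output.append() alone may not require global output.

```python
items = []      # module-level list
counter = 0     # module-level integer

def add_item(value):
    # Mutating the existing list object: no global statement needed
    items.append(value)

def bump_without_global():
    # The assignment makes counter local to this function, so reading it
    # before it has a value raises UnboundLocalError
    counter += 1

def bump_with_global():
    global counter  # declare before assigning
    counter += 1

add_item("line")
bump_with_global()
try:
    bump_without_global()
except UnboundLocalError:
    rebinding_failed = True
else:
    rebinding_failed = False

print(items, counter, rebinding_failed)
```

The distinction is between changing the object a name refers to and changing which object the name refers to; only the second needs the declaration.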

3 answers

Answer 1: score 2, posted 2015-10-20 18:10:20
Answer 2: score 1, posted 2015-10-20 18:12:21
Answer 3: score 0, accepted, posted 2015-10-20 18:11:42
