How do I search a file for specific text and return the corresponding file name when it is found, in Python?

Question

Suppose I have a text file with the following contents:

f: 1.pdf
t: abc
f: 2.pdf
t: as, as
asd
f: 3.pdf
t: found
f: 4.pdf
t: .,ad
.ads
f: 5.pdf
t: ad
f: 6.pdf
t: ...

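I want my Python script to read this text file and, whenever it finds the word "found", write the file name on the line above it to an output file. For the example above, the script would write 3.pdf to the output file, because the word "found" appears on the line below it.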

I think this needs a loop and a regular expression to match the word? I have a rough idea, but I don't know how to start.

Answer 1

You can use this context manager:

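# Read every line, then collect the name on the line just before each ": found" line.
with open('text.txt', 'r') as s, open('output.txt', 'w') as f:
    lns = s.read().splitlines()
    # i > 0 keeps lns[i-1] from wrapping around to the last line of the file.
    t = [lns[i-1].split(': ', 1)[1] for i, ln in enumerate(lns) if i > 0 and ln.endswith(': found')]
    f.write('\n'.join(t))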

If you want it to be clearer:

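with open('text.txt', 'r') as s:
    lines = s.read().splitlines()

files = []
for i, line in enumerate(lines):
    # i > 0 keeps lines[i-1] from wrapping around to the last line of the file.
    if i > 0 and line.endswith(': found'):
        # The previous line looks like "f: 3.pdf"; keep the part after "f: ".
        files.append(lines[i-1].split(': ', 1)[1])

with open('output.txt', 'w') as f:
    f.write('\n'.join(files))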
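To try the same idea without touching any files, the logic can be wrapped in a function and run on the sample data from the question. This is only a sketch: the function name files_with_found and its marker parameter are made up for illustration, and it assumes the match must be the exact line "t: found" directly after an "f: " line.

```python
def files_with_found(lines, marker="t: found"):
    """Return the file names whose immediately following line equals `marker`."""
    names = []
    for i, line in enumerate(lines):
        # i > 0 guards against lines[-1] wrapping around to the last line.
        if i > 0 and line.strip() == marker and lines[i - 1].startswith("f: "):
            # Drop the "f: " prefix to keep only the file name.
            names.append(lines[i - 1][len("f: "):].strip())
    return names


# Sample data copied from the question.
sample = """f: 1.pdf
t: abc
f: 2.pdf
t: as, as
asd
f: 3.pdf
t: found
f: 4.pdf
t: .,ad
.ads
f: 5.pdf
t: ad
f: 6.pdf
t: ..."""

print(files_with_found(sample.splitlines()))  # ['3.pdf']
```

Returning a list rather than writing straight to output.txt keeps the matching step easy to check; the result can still be written out with '\n'.join(...) as above.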

Answer 2

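This suggested approach is based on the clarification that the line with t: will immediately follow the line with f:, and on the preference for a solution that loops over the file rather than reading it all into memory.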

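No regular-expression parsing is needed in this case. The only complicating factor is that lines have to be considered in pairs rather than one at a time. This is easily handled by storing the previous line in another variable, which is copied from the current line at the end of each loop iteration, ready for the next one.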

previous_line = None

with open("myinput") as fin:
    with open("myoutput", "w") as fout:
        for line in fin:
            line = line.strip()
            if (line == "t: found"
                and previous_line is not None
                and previous_line.startswith("f: ")):

                fout.write(previous_line[3:] + "\n")

            previous_line = line

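Because each line is preprocessed with strip, any trailing whitespace after "found" (including the newline) is removed before the comparison.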
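The sample data in the question also contains text that spills onto a second line (for example "asd" after "t: as, as"). If "found" could appear on such a continuation line, one possible variation is to remember the most recent "f: " line instead of only the previous line. The sketch below is an assumption about the desired behaviour, not something stated in the question: the name iter_matching_files is hypothetical, it uses a whole-word regular expression, and it reads from an in-memory io.StringIO so it can be run as is.

```python
import io
import re

# Whole-word match; treating "found" as a separate word is an assumption.
FOUND = re.compile(r"\bfound\b")


def iter_matching_files(stream):
    """Yield each file name whose text block contains the word 'found'."""
    current = None     # name taken from the most recent "f: " line
    reported = False   # avoid yielding the same name twice
    for raw in stream:
        line = raw.strip()
        if line.startswith("f: "):
            # A new record starts: remember its name and reset the flag.
            current, reported = line[3:], False
        elif current is not None and not reported and FOUND.search(line):
            reported = True
            yield current


text = "f: 1.pdf\nt: abc\nf: 2.pdf\nt: as, as\nfound here\nf: 3.pdf\nt: found  \n"
print(list(iter_matching_files(io.StringIO(text))))  # ['2.pdf', '3.pdf']
```

Like the loop above, this reads one line at a time, so a real file object from open(...) can be passed in place of the StringIO.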

Answer 1
1, accepted, 2020-06-25 20:27:27

Answer 2
1, 2020-06-25 20:44:23
