Extract specific records from a text file and save to a new file in Python
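I have a txt file containing information for thousands of receipts. There are two types:
I need to get all the summary receipts, together with their contents, and write them to a new file.
Below is what I have done so far, but all it does is copy everything into a new file.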
filtered = []
with open("sample.txt", "r+") as file:
    for line in file:
        filtered.append(line.split("""
Company Name
A CITY
Name of CITY
Tin:00000
#10000
N#00108235 Cashier ID#0000
- - - - - - - - - - - - - - - - - - - -
Report(X-Report)
"""))
outputfile = open("output.txt","w")
for lines in filtered:
    outputfile.write(str(lines))
I am very new to Python and would really appreciate any tips or guidance. TIA
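Do you only need to separate them by type? Going by your explanation, the simple solution is to read the contents of the file and look for the phrase "SUMMARY OF CHARGES" in it; if it is found, save the contents to a new file. A regular expression for any line containing the word abc would be .*abc.*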
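One thing to watch with a pattern like .*abc.* is that "." does not match a newline by default and re.match only tries the start of the string. The sketch below illustrates this on a made-up three-line text (the sample text is an assumption, not taken from the real receipts file):

```python
import re

# Hypothetical receipt text, made up for illustration only.
text = "Company Name\nSUMMARY OF CHARGES\nTotal: 10.00\n"

# re.match anchors at the start and "." stops at a newline,
# so this pattern only ever sees the first line.
first_line_only = re.match(r".*SUMMARY OF CHARGES.*", text)

# re.search scans the whole string, so the phrase is found on any line.
anywhere = re.search(r"SUMMARY OF CHARGES", text)

# With re.DOTALL, "." also matches newlines, so re.match works as well.
with_dotall = re.match(r".*SUMMARY OF CHARGES.*", text, re.DOTALL)

print(first_line_only is None, anywhere is not None, with_dotall is not None)
# prints: True True True
```

For a plain fixed phrase, the test "SUMMARY OF CHARGES" in text gives the same answer as re.search without needing a regular expression at all.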
If each receipt is in its own file, the code would look like this.
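import re
with open("sample.txt", "r") as sfile:
    cont = sfile.read()
# re.search scans the whole text; re.match with ".*" would only test the
# first line, because "." does not match a newline by default.
if re.search("SUMMARY OF CHARGES", cont):
    with open("outfile.txt", "w") as outfile:
        outfile.write(cont)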
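To separate the contents of the individual receipts, you can use regular expression groups. Write a regex that matches exactly one receipt, then create a group (your_regex)* and iterate over the matches to get every receipt. Note that in Python a repeated group only keeps its last repetition, so iterating with re.findall or re.finditer over the single-receipt pattern is the usual way to collect them all.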
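As a rough sketch of that idea, assuming every receipt begins with a "Company Name" line (as in the header shown in the question; the real file may need a different marker), re.finditer can pull out one receipt per match. The helper name find_receipts and the sample text are hypothetical:

```python
import re


def find_receipts(text, header="Company Name"):
    """Return a list of receipts, each starting at a header line.

    Assumes `header` marks the first line of every receipt.
    """
    # One receipt: the header, then everything (lazily) up to the next
    # header or the end of the text. re.DOTALL lets "." cross newlines.
    pattern = re.compile(
        re.escape(header) + r".*?(?=" + re.escape(header) + r"|\Z)",
        re.DOTALL,
    )
    return [m.group(0) for m in pattern.finditer(text)]


# Made-up sample with one report receipt and one summary receipt.
sample = (
    "Company Name\nReport(X-Report)\nsales 1\n"
    "Company Name\nSUMMARY OF CHARGES\ntotal 2\n"
)
receipts = find_receipts(sample)
summaries = [r for r in receipts if "SUMMARY OF CHARGES" in r]
```

Here receipts holds two strings and summaries only the second one. The lookahead keeps the next header out of the current match, so each receipt stays intact.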
First, we can split the whole file into a list of receipts like this.
with open("sample.txt", "r") as file:
    receipts = file.read()
# Convert the text to a list of receipts
receipts = receipts.split("- - - - -")  # <=== This should be tweaked to ensure that every receipt is split. You can also use "FROM THE DATE PERMIT TO USE"
Then we filter our list so that only the summaries remain from the list of receipts.
my_filter = lambda receipt: "SUMMARY OF CHARGE" in receipt
summaries = list(filter(my_filter, receipts))
with open("out.txt", "a") as outfile:
    for summary in summaries:
        outfile.write(summary)
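The two steps can also be wrapped into one function. This is only a sketch under assumptions: the function name extract_summaries and its parameters are hypothetical, and the right delimiter depends on what really separates the receipts in the file.

```python
from pathlib import Path


def extract_summaries(src, dst, delimiter, keyword="SUMMARY OF CHARGE"):
    """Copy receipts containing `keyword` from src to dst; return the count."""
    text = Path(src).read_text()
    # Keep only the chunks that mention the keyword.
    kept = [r for r in text.split(delimiter) if keyword in r]
    # "w" rather than "a", so re-running does not duplicate earlier output.
    with open(dst, "w") as outfile:
        for receipt in kept:
            # str.split drops the delimiter, so write it back after each
            # kept receipt to keep them separated in the output.
            outfile.write(receipt + delimiter)
    return len(kept)
```

A call might look like extract_summaries("sample.txt", "out.txt", "FROM THE DATE PERMIT TO USE"), assuming that footer line really ends every receipt.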