Strip out block of text from a file into new file
I need help splitting the blocks of text in a file into separate files.
Example:
ltm data-group internal /Common/www_web {
records {
/images { }
/images/ { }
/test/common/ { }
}
type string
}
ltm monitor http /Common/data {
adaptive disabled
defaults-from /Common/http
destination *:*
interval 1
ip-dscp 0
recv "\{\"status\":\"UP\""
recv-disable "\{\"status\":\"DOWN\""
send {}
time-until-up 0
timeout 4
}
ltm profile http /Common/stage {
adaptive disabled
defaults-from /Common/http
destination *:*
interval 5
ip-dscp 0
recv "\{\"status\":\"UP\""
recv-disable "\{\"status\":\"DOWN\""
send "GET /proxy/test HTTP/1.1\r\nHost: staging\r\nConnection: close\r\n\r\n"
time-until-up 0
timeout 16
}
I want to strip out each block and write it to a separate file, for example:
ltm data-group internal /Common/www_web {
records {
/images { }
/images/ { }
/test/common/ { }
}
type string
}
into one separate file, and
ltm monitor http /Common/data {
adaptive disabled
defaults-from /Common/http
destination *:*
interval 1
ip-dscp 0
recv "\{\"status\":\"UP\""
recv-disable "\{\"status\":\"DOWN\""
send {}
time-until-up 0
timeout 4
}
into another separate file, and so on. So far I have been trying to find a regex that does this. Here is my code:
#!/usr/bin/python
import sys
import re
with open(sys.argv[1], 'r') as f:
    contents = f.read()
# Plain r"" prefix: the ur"" prefix is a syntax error on Python 3.
regex = r"(^ltm[\s\S]+^ltm)"
matches = re.search(regex, contents, re.MULTILINE)
if matches:
    print("{match}".format(start=matches.start(), end=matches.end(), match=matches.group()))
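So far this regex captures everything from the first 'ltm' to the last 'ltm' in the text, rather than one block at a time. Any help would be appreciated.
I have looked at "Using python to strip out a block of text from a file", but it did not help me much.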
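You can use re.split() with a simple list comprehension. Note that re.MULTILINE is needed so that ^ matches at the start of every line, and it has to be passed as flags= because the third positional argument of re.split() is maxsplit:
blocks = ["ltm " + s for s in re.split(r"^ltm ", contents, flags=re.MULTILINE)[1:]]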
You can also use this regex with re.finditer() (but it will be much less efficient):
blocks = [match.group(0) for match in re.finditer(r"^ltm ((?!^ltm ).)*", contents, re.DOTALL | re.MULTILINE)]
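Or this one with re.findall(). The groups are non-capturing so that findall() returns whole matches instead of tuples of groups:
blocks = re.findall(r"^ltm (?:(?!^ltm ).)*", contents, re.DOTALL | re.MULTILINE)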
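The regex variants here all depend on ^ matching at the start of each line, which only happens when re.MULTILINE is set. Here is a small self-contained sketch that checks this on made-up input. The `sample` text and the `split_blocks` name are my own, for illustration only; they are not from the question.

```python
import re

# Made-up sample with the same shape as the config in the question.
sample = (
    "ltm data-group internal /Common/www_web {\n"
    "    type string\n"
    "}\n"
    "ltm monitor http /Common/data {\n"
    "    interval 1\n"
    "}\n"
)


def split_blocks(contents):
    """Split contents into blocks that each start with 'ltm ' at a line start."""
    # flags= must be passed by keyword: the third positional argument
    # of re.split() is maxsplit, not flags.
    parts = re.split(r"^ltm ", contents, flags=re.MULTILINE)
    # parts[0] is whatever precedes the first 'ltm ' (empty here), so skip it.
    return ["ltm " + part for part in parts[1:]]


blocks = split_blocks(sample)

# The lookahead-based pattern, for comparison with the split approach.
alt = [
    m.group(0)
    for m in re.finditer(r"^ltm (?:(?!^ltm ).)*", sample, re.DOTALL | re.MULTILINE)
]

for block in blocks:
    print(repr(block.splitlines()[0]))
print(blocks == alt)
```

Without re.MULTILINE, ^ matches only at the very start of the string, so the split yields at most one block containing the whole file.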
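You can also implement the same thing without a regex, using str.find() (a counterpart of str.index() that returns -1 instead of raising an exception when the substring is not found):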
delimiter = "ltm "
blocks = []
index = contents.find(delimiter)
while index >= 0:
    new_index = contents.find(delimiter, index + 1)
    # Keep the block only if the delimiter sits at the start of a line.
    # This assumes "ltm " never appears in the middle of a line.
    if not index or contents[index - 1] in {"\r", "\n"}:
        blocks.append(contents[index: new_index] if new_index > 0 else contents[index:])
    index = new_index
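None of the snippets above writes anything to disk yet, which is what the question actually asks for. Below is a minimal sketch of that last step. The `write_blocks` helper, the `block_<n>.txt` naming scheme and the temporary output directory are arbitrary choices of mine, so adapt them to your needs.

```python
import re
import tempfile
from pathlib import Path


def write_blocks(contents, out_dir):
    """Write every 'ltm ...' block of contents to its own file in out_dir.

    Returns the list of paths written. The block_<n>.txt naming scheme
    is an arbitrary choice for this sketch.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Split at every 'ltm ' that starts a line; drop the text before the first one.
    parts = re.split(r"^ltm ", contents, flags=re.MULTILINE)[1:]
    paths = []
    for number, part in enumerate(parts, start=1):
        path = out_dir / f"block_{number}.txt"
        path.write_text("ltm " + part)
        paths.append(path)
    return paths


# Small made-up input, written to a temporary directory.
demo = "ltm a /Common/x {\n}\nltm b /Common/y {\n}\n"
demo_dir = tempfile.mkdtemp()
written = write_blocks(demo, demo_dir)
print([p.name for p in written])
```

For the real file you would read it first, for example `write_blocks(open(sys.argv[1]).read(), "out")`, where "out" is just a placeholder directory name.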
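I don't know what you want to name each file, so change that part to suit your needs, but this may help. The script scans the input line by line for 'ltm'; each time it finds one, it starts a new file named after the next block count.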
def save_file_name(val):
    """returns a file name based on the block count"""
    return f'block_{val}.txt'

# opens and stores said file.
file = open('ltm.txt', 'r')
# starting count.
count = 0
# The file that will contain each block.
# block_0.txt receives any lines that come before the first 'ltm'
# (it stays empty if the input starts with 'ltm').
new_file = open(save_file_name(str(count)), 'w')
# As @Olvin Roght had pointed out, it's better to read
# the file line by line in this fashion.
for line in file:
    # The part that scans for the wanted keyword.
    if line.startswith('ltm'):
        # If True will fire this set of code.
        # add one to count.
        count += 1
        # Close the now finished file.
        new_file.close()
        # create and store a new file.
        # Important to note to turn count into a string
        new_file = open(save_file_name(str(count)), 'w')
        # write the line to the file.
        new_file.write(line)
    else:
        # And if not the wanted keyword, just write
        # to the still open file
        new_file.write(line)
# Always remember to close the files that you open.
new_file.close()
file.close()
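If you would rather name each file after the object it holds instead of a running count, one option is to derive the name from the header line. This is only a sketch: it assumes every header looks like `ltm <type> /Partition/name {` as in the question, and the `name_from_header` and `split_by_header` helpers and the sanitising rule are hypothetical choices of mine.

```python
import re


def name_from_header(line, fallback):
    """Build a file name from a header such as
    'ltm monitor http /Common/data {'; use fallback if no path is found."""
    match = re.search(r"(/\S+)\s*\{\s*$", line)
    if not match:
        return fallback
    # '/Common/data' -> 'Common_data.txt'; replace anything unsafe with '_'.
    safe = re.sub(r"[^A-Za-z0-9._-]+", "_", match.group(1)).strip("_")
    return f"{safe}.txt"


def split_by_header(lines):
    """Group lines into {file_name: text}, starting a new group at each
    line that begins with 'ltm '. Lines before the first header are dropped."""
    groups = {}
    current = None
    for count, line in enumerate(lines):
        if line.startswith("ltm "):
            current = name_from_header(line, f"block_{count}.txt")
            groups[current] = ""
        if current is not None:
            groups[current] += line
    return groups


demo_lines = [
    "ltm monitor http /Common/data {\n",
    "interval 1\n",
    "}\n",
    "ltm profile http /Common/stage {\n",
    "timeout 16\n",
    "}\n",
]
for name, text in split_by_header(demo_lines).items():
    print(name, len(text.splitlines()))
```

Passing an open file object instead of `demo_lines` works the same way, since both yield lines. Be aware that two objects of different types can share a path, in which case the later block overwrites the earlier one here; adding the object type to the name would avoid that.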