Python 3.x print number of lines after a specific header
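I have a problem I can't seem to solve; apologies if this is a duplicate, but I never found a real answer. I am extracting specific information from a config file that presents its information in blocks of text, and I need to print only a specific block, without its header. So, for example (using the text format below), I want to capture only the information under header2, and nothing from header3 onward: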
# Output could contain multiple headers and lines, or no lines per header; this is an example of what could be present, but it is not absolute.
header1
-------
line1
line2
line3 # can be muiplies availables or known
header2
-------
line1
line2
line3 # can be muiplies availables or known
header3
-------
header4
-------
line1
line2
line3 # can be multiple linnes or none not known
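This is the code I started with, but I am stuck on the boolean or logic for the second loop, which should print only the lines of that header block: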
Raw_file = "scrap.txt"
scrape = open(Raw_file,"r")
for fooline in scrape:
    if "Header" in fooline:
        #print(fooline) # prints all lines
        #print lines under header 2 and stop before header 3
scrape.close()
Use detection of the header lines to switch a boolean that controls printing on and off:
RAW_FILE = "scrap.txt"
DESIRED = 'header2'

with open(RAW_FILE) as scrape:
    printing = False
    for line in scrape:
        if line.startswith(DESIRED):
            printing = True
        elif line.startswith('header'):
            printing = False
        elif line.startswith('-------'):
            continue
        elif printing:
            print(line, end='')
OUTPUT
> python3 test.py
line1
line2
line3 # can be muiplies availables or known
>
Adjust as needed.
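One possible adjustment, offered as a sketch rather than as part of the answer above, is to wrap the same toggle logic in a function that works on any iterable of lines and returns the result instead of printing it. The function name `lines_under_header` and its `header_prefix` and `rule_prefix` parameters are hypothetical, and the code assumes every header starts with "header" and every underline starts with "-------", as in the sample data.

```python
def lines_under_header(lines, desired, header_prefix="header", rule_prefix="-------"):
    """Return the lines found under the `desired` header.

    The header line itself and its dashed rule are not included.
    Assumes headers start with `header_prefix` and rules with `rule_prefix`.
    """
    collected = []
    printing = False
    for line in lines:
        if line.startswith(desired):
            printing = True       # entered the block we want
        elif line.startswith(header_prefix):
            printing = False      # any other header ends the block
        elif line.startswith(rule_prefix):
            continue              # skip the dashed rule under a header
        elif printing:
            collected.append(line.rstrip("\n"))
    return collected


# Small in-memory sample in the same shape as the question's data.
sample = [
    "header1\n", "-------\n", "line1\n",
    "header2\n", "-------\n", "line1\n", "line2\n",
    "header3\n", "-------\n",
]
print(lines_under_header(sample, "header2"))  # ['line1', 'line2']
```

Because the check is a prefix match, "header2" would also match a header named "header20"; compare against `line.strip()` instead if exact header names matter.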
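You might consider using a regular expression to break the text into blocks.
If the file is of a manageable size, read it all at once and use the regex: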
(^header\d+[\s\S]+?(?=^header|\Z))
to split it into blocks.
Your Python code would then look like this (to get any text between headers):
import re

# fn is the path to the input file
with open(fn) as f:
    txt = f.read()

for m in re.finditer(r'(^header\d+[\s\S]+?(?=^header|\Z))', txt, re.M):
    print(m.group(1))
If the file is larger than you want to read in one gulp, you can use mmap with the regex and read the file in fairly large chunks.
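A minimal sketch of the mmap idea follows, assuming the same header format as above. The function name `blocks_from_mmap` is hypothetical. Note that a memory-mapped file exposes bytes, so the pattern has to be a bytes pattern and each match is decoded afterwards.

```python
import mmap
import re

# Bytes pattern: a memory-mapped file is a bytes-like object, not a str.
BLOCK_RE = re.compile(rb'^header\d+[\s\S]+?(?=^header|\Z)', re.M)


def blocks_from_mmap(path):
    """Return every header block in the file at `path`, decoded to str."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; mmap raises ValueError for an empty file.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # Decode inside the `with` so the results do not depend on the map.
            return [m.group(0).decode() for m in BLOCK_RE.finditer(mm)]
```

The operating system pages the file in on demand, so the whole file does not have to be read into a Python string first.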
If you are only looking for a single header, it is even easier:
m = re.search(r'(^header2[\s\S]+?(?=^header|\Z))', txt, re.M)
if m:
    print(m.group(1))
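Since the question asks for the block without its header, one variation is to keep the header line and its dashed rule outside the capturing group. This is a sketch, not part of the original answer: the `txt` sample and the modified pattern are illustrative, and the pattern assumes the rule line, when present, consists of dashes directly under the header.

```python
import re

txt = (
    "header1\n-------\nline1\n"
    "header2\n-------\nline1\nline2\n"
    "header3\n-------\n"
)

# Group 1 is the block body only: the header line and an optional dashed
# rule are matched but left outside the capturing group.
pattern = r'^header2[^\n]*\n(?:-+[^\n]*\n)?([\s\S]*?)(?=^header|\Z)'

m = re.search(pattern, txt, re.M)
body = m.group(1) if m else ""
print(body, end="")
```

For this sample the script prints line1 and line2 on separate lines. A header with no lines under it yields an empty string rather than no match.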
You can set a flag that starts and stops collection based on matching header2 and header3.
Using example.txt, which contains the full sample data provided:
f = "example.txt"
scrape = open(f,"r")

collect = 0
wanted = []

for fooline in scrape:
    if "header2" in fooline:
        collect = 1
    if "header3" in fooline:
        collect = 2

    if collect == 1:
        wanted.append(fooline)
    elif collect == 2:
        break

scrape.close()
wanted
Output:
['header2\n',
'-------\n',
'line1\n',
'line2\n',
'line3 # can be muiplies availables or known\n',
'\n']
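The collected list still contains the header line, the dashed rule, and a trailing blank line. If only the content lines are wanted, a small filter could be applied afterwards. This is a sketch: `wanted` is copied from the output above, and `content` is a hypothetical name.

```python
wanted = [
    'header2\n',
    '-------\n',
    'line1\n',
    'line2\n',
    'line3 # can be muiplies availables or known\n',
    '\n',
]

# Drop the header, the dashed rule, and blank lines; strip trailing newlines.
content = [
    line.rstrip('\n')
    for line in wanted
    if line.strip()
    and not line.startswith('header')
    and not line.startswith('-------')
]
print(content)
```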
Initially, set flag to False. Check whether the line starts with header2; if it does, set flag to True. If the line starts with header3, set flag to False.
If flag is set, print the line.
Raw_file = "scrap.txt"
scrape = open(Raw_file,"r")
flag = False

for fooline in scrape:
    if fooline.find("header3") == 0:
        flag = False  # or break
    if flag:
        print(fooline, end='')  # each line already ends with a newline
    if fooline.find("header2") == 0:
        flag = True
scrape.close()
Output:
-------
line1
line2
line3 # can be muiplies availables or known