How to extract text part from file using Python & Regular Expressions
Extract part of a text file with Python
I have a text file in the following format:
Initialization
Background
Starting_Measurement1
1
2
3
...
100
End_Measurement1
Starting_Measurement2
1
2
3
...
75
End_Measurement2
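I want to store the values between Starting_Measurement1 and End_Measurement1, and then the values between Starting_Measurement2 and End_Measurement2, but the number of values in each block varies.
Is there a clean way to do this in Python?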
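# textInput holds the whole file contents as a single string
b = textInput.split('\n')
start1 = b.index('Starting_Measurement1')
end1 = b.index('End_Measurement1')
# start1 + 1 skips the marker line itself, so only the values are kept
measureList1 = b[start1 + 1:end1]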
and repeat for the other measurements.
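The index-and-slice idea above can be wrapped in a small helper so that it does not have to be copied for every measurement. This is only a sketch: the function name `extract_measurement` and the inline `sample` text are made up for illustration, and it assumes every block has both its `Starting_Measurement<n>` and `End_Measurement<n>` marker (otherwise `list.index` raises `ValueError`).

```python
def extract_measurement(text, number):
    """Return the lines between the start and end markers of one measurement."""
    lines = text.split("\n")
    start = lines.index(f"Starting_Measurement{number}")
    # Search for the end marker only after the start marker
    end = lines.index(f"End_Measurement{number}", start)
    # start + 1 skips the marker line itself
    return lines[start + 1:end]


# Hypothetical sample input that mimics the file layout from the question
sample = "\n".join([
    "Initialization", "Background",
    "Starting_Measurement1", "1", "2", "3", "End_Measurement1",
    "Starting_Measurement2", "1", "2", "End_Measurement2",
])

measurements = {n: extract_measurement(sample, n) for n in (1, 2)}
print(measurements)
```

The values come back as strings; wrap them in `int(...)` if numbers are needed.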
It may not be the most elegant solution, but it should do what you want:
Contents of the text file named data.txt:
Initialization
Background
Starting_Measurement1
1
2
3
4
5
100
End_Measurement1
Starting_Measurement2
1
2
3
75
End_Measurement2
Code that reads the file and extracts the numbers into a list of lists:
with open('data.txt') as f:
    s = f.read()

all_measurements = []
reading = False  # True while we are inside a measurement block
for line in s.splitlines():
    if line.startswith('Starting_Measurement'):
        # A new block begins: start a fresh list
        current_measurements = []
        reading = True
    elif line.startswith('End_Measurement'):
        # The block is complete: store it
        all_measurements.append(current_measurements)
        reading = False
    elif reading:
        value = int(line)
        current_measurements.append(value)

print(all_measurements)
Result:
[[1, 2, 3, 4, 5, 100], [1, 2, 3, 75]]
filename = "measurements.txt"
measurements = []
measurement_lines = []
line_in_measurement = False  # True while we are inside a measurement block

with open(filename, "rt") as file:
    for line in file:
        line = line.strip("\n")
        if "Starting_Measurement" in line:
            measurement_lines = []
            line_in_measurement = True
        elif "End_Measurement" in line:
            measurements.append(measurement_lines)
            line_in_measurement = False
        elif line_in_measurement:
            # Values are kept as strings
            measurement_lines.append(line)

for measure_number, measure in enumerate(measurements):
    print(f"measurement {measure_number}")
    print(measure)
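The same line-by-line approach can also keep track of which measurement each block belongs to. The sketch below is a variation, not the code from the answers above: the function name `parse_measurements` and the `sample_lines` data are hypothetical, and it assumes every value line holds a single integer. It accepts any iterable of lines, so an open file object works as well as a list.

```python
def parse_measurements(lines):
    """Map each measurement's number suffix to its list of integer values."""
    start_marker = "Starting_Measurement"
    results = {}
    current = None  # key of the block being read, or None outside a block
    for raw in lines:
        line = raw.strip()
        if line.startswith(start_marker):
            # Use whatever follows the marker (e.g. "1") as the key
            current = line[len(start_marker):]
            results[current] = []
        elif line.startswith("End_Measurement"):
            current = None
        elif current is not None and line:
            results[current].append(int(line))
    return results


# Hypothetical sample lines, with newlines as they would come from a file
sample_lines = [
    "Initialization\n", "Background\n",
    "Starting_Measurement1\n", "1\n", "2\n", "3\n", "End_Measurement1\n",
    "Starting_Measurement2\n", "1\n", "2\n", "End_Measurement2\n",
]

parsed = parse_measurements(sample_lines)
print(parsed)
```

With a real file this could be called as `parse_measurements(open("measurements.txt"))`, ideally inside a `with` block.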
You can use re:
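import re

# Group 1 captures the measurement number, group 2 everything up to the end marker.
# re.DOTALL lets "." match newlines so that a block can span several lines.
re_numbers = re.compile(
    r"Starting_Measurement(1|2)(.*?)End_Measurement(?:1|2)",
    re.DOTALL
)

with open("file.txt", "r") as file:
    res = {
        n: numbers.strip().splitlines()
        for n, numbers in re_numbers.findall(file.read())
    }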
With the following contents of file.txt:
Initialization
Background
Starting_Measurement1
1
2
End_Measurement1
Starting_Measurement2
1
2
3
4
5
End_Measurement2
End
the result res is:
{'1': ['1', '2'], '2': ['1', '2', '3', '4', '5']}
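The pattern above is hard-coded for measurements 1 and 2. A possible generalization, shown here only as a sketch, captures any number with `\d+` and uses the backreference `\1` so that each start marker is paired with the end marker of the same number. The names `pattern`, `sample` and `result` are chosen for illustration, and the conversion to `int` assumes every value is an integer.

```python
import re

# \1 must repeat the number captured by (\d+); \b stops "1" from matching inside "10"
pattern = re.compile(
    r"Starting_Measurement(\d+)\n(.*?)End_Measurement\1\b",
    re.DOTALL,
)

# Hypothetical sample text that mimics the file layout
sample = """Initialization
Background
Starting_Measurement1
1
2
End_Measurement1
Starting_Measurement2
1
2
3
End_Measurement2
End
"""

result = {
    int(number): [int(value) for value in body.split()]
    for number, body in pattern.findall(sample)
}
print(result)
```

Reading the whole file into memory keeps the regex simple; for very large files a line-by-line loop may be the better fit.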