Extract specific text from several metadata files using Python
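How can I extract WESTBOUNDINGCOORDINATE, NORTHBOUNDINGCOORDINATE, EASTBOUNDINGCOORDINATE and SOUTHBOUNDINGCOORDINATE from the text below? The text is not on the same line in every metadata file. For example, in one file WESTBOUNDINGCOORDINATE is on line 2, but in file 2 it is on line 4. Please help...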
GROUP = BOUNDINGRECTANGLE
OBJECT = WESTBOUNDINGCOORDINATE
NUM_VAL = 1
VALUE = 80.8290376770946
END_OBJECT = WESTBOUNDINGCOORDINATE
OBJECT = NORTHBOUNDINGCOORDINATE
NUM_VAL = 1
VALUE = 39.9999999964079
END_OBJECT = NORTHBOUNDINGCOORDINATE
OBJECT = EASTBOUNDINGCOORDINATE
NUM_VAL = 1
VALUE = 104.443461525786
END_OBJECT = EASTBOUNDINGCOORDINATE
OBJECT = SOUTHBOUNDINGCOORDINATE
NUM_VAL = 1
VALUE = 29.9999999973059
END_OBJECT = SOUTHBOUNDINGCOORDINATE
END_GROUP = BOUNDINGRECTANGLE
My code:
    import glob

    metafiles = glob.glob("D://*.txt")
    for f in metafiles:
        with open(f, 'r') as infile:
            lines = infile.readlines()
            WESTBOUNDINGCOORDINATE = lines[4][29:45]
            print(WESTBOUNDINGCOORDINATE)
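The problem is that the WESTBOUNDINGCOORDINATE value is not always on the same line.
Try iterating over the file, ignoring all blank lines, and looking for lines that start with the string "OBJECT" and end with the coordinate name you want.
For example: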
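    def parse(filepath):
        """Return {object_name: {key: value}} for every OBJECT block in the file."""
        with open(filepath) as f:
            contents = f.readlines()
        output = {}
        group = {}
        inside_group = False
        for line in contents:
            line = line.strip()
            # Skip blank lines and anything that is not a "KEY = VALUE" pair.
            if '=' not in line:
                continue
            # Split on the first '=' only, so a value containing '=' stays whole.
            key, value = line.split('=', 1)
            key = key.strip()
            value = value.strip()
            if key == 'OBJECT':
                inside_group = True
            elif key == 'END_OBJECT':
                # The block is complete: store it under the object's name.
                output[value] = group
                inside_group = False
                group = {}
            elif inside_group:
                group[key] = value
        return output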
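This should return a dictionary of the following form (the values are kept as strings, so convert them with float() where needed):

    >>> parse('file1.txt')
    {
        "WESTBOUNDINGCOORDINATE": {
            "NUM_VAL": "1",
            "VALUE": "80.8290376770946"
        },
        "NORTHBOUNDINGCOORDINATE": {
            "NUM_VAL": "1",
            "VALUE": "39.9999999964079"
        },
        # etc
    }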
You can then get whichever coordinates you need from the dictionary.
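As a rough sketch of that last step, the snippet below pulls the four coordinates out as floats, regardless of which line they appear on. The names `parse_lines`, `bounding_box` and `COORDINATE_NAMES` are made up for this illustration, and it assumes every file contains all four objects with a plain, unquoted numeric VALUE; a missing object would raise `KeyError` and a non-numeric value would raise `ValueError`.

```python
import glob

# The four objects asked about in the question.
COORDINATE_NAMES = (
    "WESTBOUNDINGCOORDINATE",
    "NORTHBOUNDINGCOORDINATE",
    "EASTBOUNDINGCOORDINATE",
    "SOUTHBOUNDINGCOORDINATE",
)


def parse_lines(lines):
    """Collect the KEY = VALUE pairs found between OBJECT and END_OBJECT."""
    output, group, inside_object = {}, {}, False
    for line in lines:
        if "=" not in line:
            continue  # blank lines and lines without a KEY = VALUE pair
        key, value = (part.strip() for part in line.split("=", 1))
        if key == "OBJECT":
            inside_object = True
        elif key == "END_OBJECT":
            output[value] = group
            group, inside_object = {}, False
        elif inside_object:
            group[key] = value
    return output


def bounding_box(lines):
    """Return the four bounding coordinates as floats, keyed by object name."""
    parsed = parse_lines(lines)
    return {name: float(parsed[name]["VALUE"]) for name in COORDINATE_NAMES}


if __name__ == "__main__":
    # "D://*.txt" is the pattern from the question; adjust it to your files.
    for path in glob.glob("D://*.txt"):
        with open(path) as infile:
            # A file object yields its lines one by one when iterated.
            print(path, bounding_box(infile))
```

Because the lookup is by object name rather than by line index, extra lines before or between the blocks do not change the result.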