Extract specific text from several metadata files using Python
How can I extract WESTBOUNDINGCOORDINATE, NORTHBOUNDINGCOORDINATE, EASTBOUNDINGCOORDINATE, and SOUTHBOUNDINGCOORDINATE from the text below? The metadata files do not all have these entries on the same line: for example, file one has WESTBOUNDINGCOORDINATE on line 2, but file two has it on line 4. Please help.
GROUP = BOUNDINGRECTANGLE
OBJECT = WESTBOUNDINGCOORDINATE
NUM_VAL = 1
VALUE = 80.8290376770946
END_OBJECT = WESTBOUNDINGCOORDINATE
OBJECT = NORTHBOUNDINGCOORDINATE
NUM_VAL = 1
VALUE = 39.9999999964079
END_OBJECT = NORTHBOUNDINGCOORDINATE
OBJECT = EASTBOUNDINGCOORDINATE
NUM_VAL = 1
VALUE = 104.443461525786
END_OBJECT = EASTBOUNDINGCOORDINATE
OBJECT = SOUTHBOUNDINGCOORDINATE
NUM_VAL = 1
VALUE = 29.9999999973059
END_OBJECT = SOUTHBOUNDINGCOORDINATE
END_GROUP = BOUNDINGRECTANGLE
My code:
import glob

metafiles = glob.glob("D://*.txt")
for f in metafiles:
    with open(f, 'r') as infile:
        lines = infile.readlines()
        # Hard-coded line number and column slice: breaks when the layout shifts
        WESTBOUNDINGCOORDINATE = lines[4][29:45]
        print(WESTBOUNDINGCOORDINATE)
The problem is that the WESTBOUNDINGCOORDINATE value is not always on the same line.
Instead of relying on line numbers, try iterating through the file, ignoring all empty lines, and looking for lines which begin with the string "OBJECT" and end with the coordinate you want. The fields that follow, up to the matching "END_OBJECT" line, belong to that coordinate.
For example:
def parse(filepath):
    """Return {object name: {field: value}} for one metadata file."""
    output = {}
    group = {}
    inside_group = False
    with open(filepath) as f:
        for line in f:
            # Split on the first '=' only; skip blank lines and lines without '='
            key, sep, value = line.partition('=')
            if not sep:
                continue
            key = key.strip()
            value = value.strip()
            if key == 'OBJECT':
                inside_group = True
            elif key == 'END_OBJECT':
                # value is the object name, e.g. WESTBOUNDINGCOORDINATE
                output[value] = group
                inside_group = False
                group = {}
            elif inside_group:
                group[key] = value  # values are kept as strings
    return output
This should return a dictionary in the form:
>>> parse('file1.txt')
{
    "WESTBOUNDINGCOORDINATE": {
        "NUM_VAL": "1",
        "VALUE": "80.8290376770946"
    },
    "NORTHBOUNDINGCOORDINATE": {
        "NUM_VAL": "1",
        "VALUE": "39.9999999964079"
    },
    # etc
}
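You can then grab whichever coordinate you need from the dictionary, for example parse('file1.txt')['WESTBOUNDINGCOORDINATE']['VALUE']. The values are read as strings, so wrap them in float() if you need numbers.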
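To tie this back to the original loop over several files, here is a minimal sketch that collects only the four bounding coordinates from every file matching a glob pattern and converts them to floats. The names `COORDINATES`, `bounding_box`, and `bounding_boxes` are hypothetical helpers introduced for illustration, and the sketch assumes each file uses the same `KEY = VALUE` layout as the sample in the question.

```python
import glob

# The four object names the question asks about.
COORDINATES = (
    "WESTBOUNDINGCOORDINATE",
    "NORTHBOUNDINGCOORDINATE",
    "EASTBOUNDINGCOORDINATE",
    "SOUTHBOUNDINGCOORDINATE",
)


def bounding_box(filepath):
    """Return {coordinate name: float} for one metadata file."""
    box = {}
    current = None  # name of the OBJECT block we are inside, if any
    with open(filepath) as f:
        for line in f:
            key, sep, value = line.partition("=")
            if not sep:
                continue  # blank line or a line without '='
            key, value = key.strip(), value.strip()
            if key == "OBJECT":
                current = value
            elif key == "END_OBJECT":
                current = None
            elif key == "VALUE" and current in COORDINATES:
                box[current] = float(value)
    return box


def bounding_boxes(pattern):
    """Map each file matching the glob pattern to its bounding box."""
    return {path: bounding_box(path) for path in sorted(glob.glob(pattern))}


if __name__ == "__main__":
    # Assumed location; adjust the pattern to wherever the files live.
    for path, box in bounding_boxes("D:/*.txt").items():
        print(path, box)
```

Because the lookup is keyed on the object name rather than on a line number, it does not matter whether WESTBOUNDINGCOORDINATE appears on line 2 or line 4 of a given file. A coordinate that is missing from a file is simply absent from that file's dictionary.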