I need to filter an XML file for certain values, if the node contains this value, the node should be removed.
<?xml version="1.0" encoding="utf-8" ?>
<ogr:FeatureCollection
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://ogr.maptools.org/ TZwards.xsd"
xmlns:ogr="http://ogr.maptools.org/"
xmlns:gml="http://www.opengis.net/gml">
<gml:boundedBy></gml:boundedBy>
<gml:featureMember>
<ogr:TZwards fid="F0">
<ogr:Region_Nam>TARGET</ogr:Region_Nam>
<ogr:District_N>Kondoa</ogr:District_N>
<ogr:Ward_Name>Bumbuta</ogr:Ward_Name>
</ogr:TZwards>
</gml:featureMember>
<gml:featureMember>
<ogr:TZwards fid="F1">
<ogr:Region_Nam>REMOVE</ogr:Region_Nam>
<ogr:District_N>Kondoa</ogr:District_N>
<ogr:Ward_Name>Pahi</ogr:Ward_Name>
</ogr:TZwards>
</gml:featureMember>
</ogr:FeatureCollection>
The Python script should keep the <gml:featureMember>
node if the <ogr:Region_Nam>
contains TARGET
and remove all other nodes.
from xml.dom import minidom
import xml.etree.ElementTree as ET
tree = ET.parse('input.xml').getroot()
removeList = list()
for child in tree.iter('gml:featureMember'):
if child.tag == 'ogr:TZwards':
name = child.find('ogr:Region_Nam').text
if (name == 'TARGET'):
removeList.append(child)
for tag in removeList:
parent = tree.find('ogr:TZwards')
parent.remove(tag)
out = ET.ElementTree(tree)
out.write(outputfilepath)
Desired output:
<?xml version="1.0" encoding="utf-8" ?>
<ogr:FeatureCollection>
<gml:boundedBy></gml:boundedBy>
<gml:featureMember>
<ogr:TZwards fid="F0">
<ogr:Region_Nam>TARGET</ogr:Region_Nam>
<ogr:District_N>Kondoa</ogr:District_N>
<ogr:Ward_Name>Bumbuta</ogr:Ward_Name>
</ogr:TZwards>
</gml:featureMember>
</ogr:FeatureCollection>
My output still contains all nodes..
You need to declare the namespaces in the python code:
from xml.dom import minidom
import xml.etree.ElementTree as ET
tree = ET.parse('/tmp/input.xml').getroot()
namespaces = {'gml': 'http://www.opengis.net/gml', 'ogr':'http://ogr.maptools.org/'}
for child in tree.findall('gml:featureMember', namespaces=namespaces):
if len(child.find('ogr:TZwards', namespaces=namespaces)):
name = child.find('ogr:TZwards', namespaces=namespaces).find('ogr:Region_Nam', namespaces=namespaces).text
if name != 'TARGET':
tree.remove(child)
out = ET.ElementTree(tree)
out.write("/tmp/out.xml")
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.