How to parse and fetch only the desired XML elements from an XML file using Python?
I have an XML file that looks like this:
<rpc-reply xmlns:junos="http://xml.juniper.net/junos/15.1R5/junos">
    <vlan-information xmlns="http://xml.juniper.net/junos/15.1R5/junos-esw" junos:style="brief">
        <vlan-terse/>
        <vlan>
            <vlan-instance>0</vlan-instance>
            <vlan-name>ACRS-Dev2</vlan-name>
            <vlan-create-time>Fri Jan 1 00:37:59 2010
            </vlan-create-time>
            <vlan-status>Enabled</vlan-status>
            <vlan-owner>static</vlan-owner>
            <vlan-tag>0</vlan-tag>
            <vlan-index>2</vlan-index>
            <vlan-l3-interface>vlan.15 (UP)</vlan-l3-interface>
            <vlan-l3-interface-address>10.8.25.1/24</vlan-l3-interface-address>
            <vlan-protocol-port>Port Mode</vlan-protocol-port>
            <vlan-members-count>7</vlan-members-count>
            <vlan-members-upcount>6</vlan-members-upcount>
        </vlan>
        <vlan>
            <vlan-instance>0</vlan-instance>
            <vlan-name>default</vlan-name>
            <vlan-create-time>Fri Jan 1 00:37:59 2010
            </vlan-create-time>
            <vlan-status>Enabled</vlan-status>
            <vlan-owner>static</vlan-owner>
            <vlan-tag>0</vlan-tag>
            <vlan-index>3</vlan-index>
            <vlan-l3-interface>vlan.11 (UP)</vlan-l3-interface>
            <vlan-l3-interface-address>10.8.27.1/24</vlan-l3-interface-address>
            <vlan-protocol-port>Port Mode</vlan-protocol-port>
            <vlan-members-count>12</vlan-members-count>
            <vlan-members-upcount>2</vlan-members-upcount>
        </vlan>
    </vlan-information>
</rpc-reply>
From this, I only want the <vlan-name> and <vlan-l3-interface-address> tags to be parsed and saved in a dict/JSON-like variable with this format:
{'Vlan-Name' : vlan_name, 'Interface-Address' : interface_addr}
and then append one such dict per <vlan> element to a list of dicts. This is my code for parsing the XML and inserting the dicts into the list:
# tree is assumed to be an already parsed XML tree (e.g. from ElementTree)
root = tree.getroot()
nw_pool = []
nw_json = {}
for child in root:
    for items in child:
        for item1 in items:
            if 'vlan-l3-interface-address' in item1.tag:
                interface_addr = item1.text
                nw_json['Interface-Address'] = interface_addr
            elif 'vlan-name' in item1.tag:
                vlan_name = item1.text
                nw_json['Vlan-Name'] = vlan_name
                nw_pool.append(nw_json)
print(nw_pool)
But when I print nw_pool, the output repeats the dict of the last element found instead of giving me a distinct dict for each element.
Output:
[{'Vlan-Name': 'default', 'Interface-Address': '10.8.27.1/24'}, {'Vlan-Name': 'default', 'Interface-Address': '10.8.27.1/24'}]
Whereas my desired output is:
[{'Vlan-Name': 'ACRS-Dev2', 'Interface-Address': '10.8.25.1/24'}, {'Vlan-Name': 'default', 'Interface-Address': '10.8.27.1/24'}]
Can somebody help me with this? Thanks in advance.
You are overwriting the existing dict, while you need a new one for every iteration. So you need to move nw_json = {} to another place:
root = tree.getroot()
nw_pool = []
for child in root:
    for items in child:
        nw_json = {}  # Work with a new dict for each element
        for item1 in items:
            if 'vlan-l3-interface-address' in item1.tag:
                interface_addr = item1.text
                nw_json['Interface-Address'] = interface_addr
            elif 'vlan-name' in item1.tag:
                vlan_name = item1.text
                nw_json['Vlan-Name'] = vlan_name
                # The address is added to this same dict later in the loop
                nw_pool.append(nw_json)
print(nw_pool)
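The same "one new dict per element" idea can also be written by searching for the <vlan> elements directly instead of nesting three loops. The following is only a sketch using the standard library's xml.etree.ElementTree: the XML is a trimmed copy of the sample from the question, and the prefix "esw" and the function name extract_vlans are names made up here for illustration, not anything defined by Junos or by the original code.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the XML from the question, so the sketch is self-contained.
XML = """<rpc-reply xmlns:junos="http://xml.juniper.net/junos/15.1R5/junos">
<vlan-information xmlns="http://xml.juniper.net/junos/15.1R5/junos-esw" junos:style="brief">
<vlan-terse/>
<vlan>
<vlan-name>ACRS-Dev2</vlan-name>
<vlan-l3-interface-address>10.8.25.1/24</vlan-l3-interface-address>
</vlan>
<vlan>
<vlan-name>default</vlan-name>
<vlan-l3-interface-address>10.8.27.1/24</vlan-l3-interface-address>
</vlan>
</vlan-information>
</rpc-reply>"""

# "esw" is an arbitrary prefix chosen for this sketch; only the URI has to
# match the default namespace declared on <vlan-information>.
NS = {"esw": "http://xml.juniper.net/junos/15.1R5/junos-esw"}


def extract_vlans(root):
    """Return a list with one freshly created dict per <vlan> element."""
    pool = []
    for vlan in root.iterfind(".//esw:vlan", NS):
        # A dict literal inside the loop creates a new object each time.
        pool.append({
            "Vlan-Name": vlan.findtext("esw:vlan-name", namespaces=NS),
            "Interface-Address": vlan.findtext(
                "esw:vlan-l3-interface-address", namespaces=NS
            ),
        })
    return pool


nw_pool = extract_vlans(ET.fromstring(XML))
print(nw_pool)
```

Because the search matches only elements named vlan in that namespace, siblings such as <vlan-terse/> are skipped without any extra check. findtext returns None when a child is missing, so a VLAN without an L3 address would still produce an entry.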
The problem in your code is that you created the dict object before the loop, so the same object is overwritten on every iteration and the list ends up holding several references to it. @Hoenie's answer makes your mistake clear.
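To see the effect in isolation, here is a minimal sketch with no XML involved. The variable names shared, aliased and fresh are made up for this illustration.

```python
names = ["ACRS-Dev2", "default"]

# One dict created before the loop: every append stores a reference
# to the same object, so later writes show up in all list entries.
shared = {}
aliased = []
for name in names:
    shared["Vlan-Name"] = name
    aliased.append(shared)

print(aliased)                   # both entries show the last name written
print(aliased[0] is aliased[1])  # True: it is a single object

# A dict created inside the loop: each append stores a distinct object.
fresh = []
for name in names:
    entry = {"Vlan-Name": name}
    fresh.append(entry)

print(fresh)
```

Appending a dict to a list never copies it, which is why moving the creation of the dict inside the loop is enough to fix the original code.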
Adding to that, I would suggest trying BeautifulSoup for parsing XML, as it is simple and easy to understand. Try the code below.
from bs4 import BeautifulSoup

# The 'lxml' parser needs the lxml package to be installed
with open('test.xml') as f:
    soup = BeautifulSoup(f.read(), 'lxml')

nw_pool = []
for vlan in soup.find_all('vlan'):
    nw_json = dict()  # new dict for each <vlan>
    nw_json['Interface-Address'] = vlan.find('vlan-l3-interface-address').text
    nw_json['Vlan-Name'] = vlan.find('vlan-name').text
    nw_pool.append(nw_json)
print(nw_pool)
# Output: [{'Interface-Address': '10.8.25.1/24', 'Vlan-Name': 'ACRS-Dev2'}, {'Interface-Address': '10.8.27.1/24', 'Vlan-Name': 'default'}]
Cheers!