Extract a specific tag from an XML file using Beautiful Soup in Python
I have an XML file (let's call it abc.xml) which looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<properties>
<product name="XYZ" version="123"/>
<application-links>
<application-links>
<id>111111111111111</id>
<name>Link_1</name>
<primary>true</primary>
<type>applinks.ABC</type>
<display-url>http://ABC.displayURL</display-url>
<rpc-url>http://ABC.displayURL</rpc-url>
</application-links>
</application-links>
</properties>
My Python code looks like this:
f = open ('file.xml', 'r')
from bs4 import BeautifulSoup
soup = BeautifulSoup(f,'lxml')
print(soup.product)
for applinks in soup.application-links:
    print(applinks)
which prints the following:
<product name="XYZ" version="123"></product>
Traceback (most recent call last):
  File "parse.py", line 7, in <module>
    for applinks in soup.application-links:
NameError: name 'links' is not defined
Could you please help me understand how to print the lines whose tags include a dash/hyphen '-'?
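For context on the traceback above: Python reads `soup.application-links` as the subtraction `soup.application - links`, so it goes looking for a variable called `links` and fails with the NameError. Here is a minimal sketch using the standard `ast` module that only parses the expression (nothing is evaluated, so no `soup` object is needed):

```python
import ast

# Parse the expression from the question without evaluating it.
tree = ast.parse("soup.application-links", mode="eval")
expr = tree.body

# The top-level node is a binary subtraction, not an attribute lookup.
is_subtraction = isinstance(expr, ast.BinOp) and isinstance(expr.op, ast.Sub)

# Left operand: the attribute access `soup.application`.
left_attr = expr.left.attr
# Right operand: a bare name `links`, which is undefined in the script.
right_name = expr.right.id

print(is_subtraction, left_attr, right_name)
```

Because of this, attribute-style access cannot be used for hyphenated tag names. As far as I know, the usual BeautifulSoup approach is to pass the tag name as a string instead, for example `soup.find('application-links')` or `soup.find_all('application-links')`.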
I don't know if BeautifulSoup is the best option here, but I would suggest using the ElementTree module from the Python standard library, like so:
>>> import xml.etree.ElementTree as ET
>>> root = ET.parse('file.xml').getroot()
>>> for app in root.findall('*/application-links/*'):
...     print(app.text)
111111111111111
Link_1
true
applinks.ABC
http://ABC.displayURL
http://ABC.displayURL
So, to print only the value inside the <name> tag, you can do this:
>>> for app in root.findall('*/application-links/name'):
...     print(app.text)
Link_1
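If you need several fields per link rather than one printed value, a possible variation is to collect each inner <application-links> element into a dict. This is only a sketch: the XML is inlined as a string (declaration omitted) so it runs without a file, and the names `XML` and `links` are my own, not part of the original code.

```python
import xml.etree.ElementTree as ET

# Inline copy of the sample document from the question (assumption: the
# real script would call ET.parse('file.xml').getroot() instead).
XML = """
<properties>
  <product name="XYZ" version="123"/>
  <application-links>
    <application-links>
      <id>111111111111111</id>
      <name>Link_1</name>
      <primary>true</primary>
      <type>applinks.ABC</type>
      <display-url>http://ABC.displayURL</display-url>
      <rpc-url>http://ABC.displayURL</rpc-url>
    </application-links>
  </application-links>
</properties>
"""

root = ET.fromstring(XML.strip())

# Hyphenated tag names are plain strings in the path, so they need no escaping.
# Each inner <application-links> element becomes one {tag: text} dict.
links = [
    {child.tag: child.text for child in link}
    for link in root.findall("application-links/application-links")
]

for link in links:
    print(link["name"], link["display-url"])
```

Each dict is keyed by the child tag name, so hyphenated fields such as `display-url` are read with ordinary string keys.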