Extracting specific tags from an XML file using Beautiful Soup in Python

Question

I have an XML file (let's call it abc.xml) which looks like this:

<?xml version="1.0" encoding="UTF-8"?>

<properties>
  <product name="XYZ" version="123"/>
  <application-links>
    <application-links>
      <id>111111111111111</id>
      <name>Link_1</name>
      <primary>true</primary>
      <type>applinks.ABC</type>
      <display-url>http://ABC.displayURL</display-url>
      <rpc-url>http://ABC.displayURL</rpc-url>
    </application-links>
  </application-links>
</properties>

My Python code looks like this:

f = open ('file.xml', 'r')
from bs4 import BeautifulSoup
soup = BeautifulSoup(f,'lxml')

print(soup.product)

for applinks in soup.application-links:
    print(applinks)

which prints the following:

<product name="XYZ" version="123"></product>
Traceback (most recent call last):
  File "parse.py", line 7, in <module>
    for applinks in soup.application-links:
NameError: name 'links' is not defined

Could you please help me understand how to print tags whose names include a dash/hyphen ('-')?

Answer 1

I don't know whether BeautifulSoup is the best option here, but I would suggest using Python's ElementTree module, like so:

>>> import xml.etree.ElementTree as ET
>>> root = ET.parse('file.xml').getroot()
>>> for app in root.findall('*/application-links/*'):
...     print(app.text)
111111111111111
Link_1
true
applinks.ABC
http://ABC.displayURL
http://ABC.displayURL
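
The output above lists the values without saying which tag each one came from. Here is a minimal sketch that pairs each tag name with its text. It assumes the XML from the question, pasted inline (without the XML declaration) so that it runs without a file; the names XML and fields are just for illustration.

```python
import xml.etree.ElementTree as ET

# Inline copy of the XML from the question, so the sketch needs no file on disk.
XML = """<properties>
  <product name="XYZ" version="123"/>
  <application-links>
    <application-links>
      <id>111111111111111</id>
      <name>Link_1</name>
      <primary>true</primary>
      <type>applinks.ABC</type>
      <display-url>http://ABC.displayURL</display-url>
      <rpc-url>http://ABC.displayURL</rpc-url>
    </application-links>
  </application-links>
</properties>"""

root = ET.fromstring(XML)

# '*/application-links/*' selects every child of the inner <application-links>.
# Hyphens are fine here because the tag name is part of a string, not Python syntax.
fields = {child.tag: child.text for child in root.findall('*/application-links/*')}

for tag, text in fields.items():
    print(f"{tag}: {text}")
```

With more than one inner <application-links> element, a single dict like this would overwrite earlier entries, so you would want to loop over the inner elements and build one dict per element instead.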

So, to print the value inside the <name> tag, you can do this:

>>> for app in root.findall('*/application-links/name'):
...     print(app.text)
Link_1
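
If you would rather stay with BeautifulSoup, the NameError comes from Python itself: soup.application-links is parsed as (soup.application) - links, so Python looks for a variable called links. Passing the tag name as a string to find() or find_all() avoids this. A minimal sketch, assuming the XML from the question pasted inline and the built-in "html.parser" backend (the question used 'lxml'); the names XML, outer, inner and values are just for illustration.

```python
from bs4 import BeautifulSoup

# Inline copy of the XML from the question, so the sketch needs no file on disk.
XML = """<properties>
  <product name="XYZ" version="123"/>
  <application-links>
    <application-links>
      <id>111111111111111</id>
      <name>Link_1</name>
      <primary>true</primary>
      <type>applinks.ABC</type>
      <display-url>http://ABC.displayURL</display-url>
      <rpc-url>http://ABC.displayURL</rpc-url>
    </application-links>
  </application-links>
</properties>"""

# "html.parser" ships with Python; it is used here only to avoid an extra dependency.
soup = BeautifulSoup(XML, "html.parser")

# Attribute access cannot express a hyphenated name, so pass it as a string.
outer = soup.find("application-links")   # first match: the outer element
inner = outer.find("application-links")  # the nested element inside it

# find_all(True, recursive=False) returns only the direct child tags.
values = {tag.name: tag.get_text() for tag in inner.find_all(True, recursive=False)}

for name, text in values.items():
    print(f"{name}: {text}")
```

The same string-based lookup works for the other hyphenated tags, for example inner.find("display-url").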

Answer 1: 0 · accepted · 2020-06-02 13:34:15
