I am trying to parse a web blog page and extract certain data into a list. Here is the XML:

http://www-01.ibm.com/software/support/lifecycle/rss/PLCWeeklyXMLDownload.xml

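There are multiple records, and from each record I need to extract the software title, version number, release number, ModLevelNumber, and end-of-service date (if there is one) and put them into a list.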

I am running Python code, but I am new to XML, so any help would be appreciated.

import urllib.request
import xml.etree.ElementTree as et

def myDownload():
    # Download the feed and parse it into an element tree.
    response = urllib.request.urlopen("http://www-01.ibm.com/software/support/lifecycle/rss/PLCWeeklyXMLDownload.xml")
    tree = et.parse(response)
    root = tree.getroot()
    aList = []

    for child in root:
        # Software title of the current record.
        for node in child.findall("SWTitle"):
            title = node.text
            aList.append(title)
        # Walk down to each release/mod entry of the record.
        for nodes in child.findall("Versions"):
            for version in nodes.findall("Version"):
                for release in version.findall("Release_Mods"):
                    for mod in release.findall("Release_Mod"):
                        rNum = mod.find("releaseNumber")
                        rNumber = rNum.text
                        nNum = mod.find("modLevelNumber")
                        nNumber = nNum.text
                        aList.append(rNumber)
                        aList.append(nNumber)

    return aList

Can anyone help me adjust this code? It does not seem to work.

===============>>#1 Votes: 1

Use the lxml library to parse the XML. ElementTree does not work with more deeply nested tags.

===============>>#2 Votes: 0

You can use the lxml library for this:

import requests
from lxml import etree

r = requests.get('http://www-01.ibm.com/software/support/lifecycle/rss/PLCWeeklyXMLDownload.xml')
xml = r.content
xml_dom = etree.fromstring(xml)

# Iterate over <SWTitleRecord>
for record_node in xml_dom:
    data = {}
    for attr_node in record_node:
        if attr_node.tag == 'SWTitle':
            data['title'] = attr_node.text
        elif attr_node.tag == 'Versions':
            # parse versions
            pass
    ...
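
The "parse versions" step could be filled in roughly as follows. This is only a sketch run against a made-up inline sample, not the real feed: the nesting follows the tag names used in the question (SWTitle, Versions, Version, Release_Mods, Release_Mod, releaseNumber, modLevelNumber), while the root tag `records`, the sample values, and the helper name `parse_records` are assumptions for illustration. The version number and end-of-service date are left out because their tag names do not appear in this thread. The fallback import is there only so the sketch runs without lxml installed; the calls used exist in both libraries.

```python
try:
    from lxml import etree  # the library suggested in this answer
except ImportError:
    # Fallback so the sketch runs without lxml; the calls below exist in both.
    import xml.etree.ElementTree as etree

# Hypothetical sample: tag nesting taken from the question, root tag and
# values invented for illustration.
SAMPLE_XML = b"""<records>
  <SWTitleRecord>
    <SWTitle>Example Product</SWTitle>
    <Versions>
      <Version>
        <Release_Mods>
          <Release_Mod>
            <releaseNumber>1</releaseNumber>
            <modLevelNumber>0</modLevelNumber>
          </Release_Mod>
          <Release_Mod>
            <releaseNumber>2</releaseNumber>
          </Release_Mod>
        </Release_Mods>
      </Version>
    </Versions>
  </SWTitleRecord>
</records>"""


def parse_records(xml_bytes):
    """Return one dict per record with its title and release/mod-level pairs."""
    xml_dom = etree.fromstring(xml_bytes)
    records = []
    for record_node in xml_dom:
        data = {"title": record_node.findtext("SWTitle"), "releases": []}
        # A path expression walks the nested levels in a single call.
        for mod in record_node.findall("Versions/Version/Release_Mods/Release_Mod"):
            data["releases"].append({
                # findtext returns None when the child element is missing.
                "release": mod.findtext("releaseNumber"),
                "mod_level": mod.findtext("modLevelNumber"),
            })
        records.append(data)
    return records


records = parse_records(SAMPLE_XML)
print(records)
```

Using `findtext` rather than `find(...).text` avoids an AttributeError when a record has no such child, which matters for optional fields like the end-of-service date mentioned in the question.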

  Asked by BAI; translated from Stack Overflow.
