
Python: Can't iterate sub elements using ElementTree

I have the following code to parse an XML document, but it won't let me iterate through the children:

import urllib, urllib2, re, time, os
import xml.etree.ElementTree as ET 

def wgetUrl(target):
    try:
        req = urllib2.Request(target)
        req.add_header('User-Agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.3 Gecko/2008092417 Firefox/3.0.3')
        response = urllib2.urlopen(req)
        outtxt = response.read()
        response.close()
    except:
        return ''
    return outtxt

newUrl = 'http://feeds.rasset.ie/rteavgen/player/playlist?showId=10056467'

data = wgetUrl(newUrl)
tree = ET.fromstring(data)
#tree = ET.parse(data)
for elem in tree.iter('entry'):
    print elem.tag, elem.attrib

Now, if I remove 'entry' from the iter call, I get output like this (why the URL?):

{http://www.w3.org/2005/Atom}entry {}
{http://www.w3.org/2005/Atom}id {}
{http://www.w3.org/2005/Atom}published {}
{http://www.w3.org/2005/Atom}updated {}
{http://www.w3.org/2005/Atom}title {'type': 'text'}

But if I write the iter statement like this, it still does not find the children of entry:

for elem in tree.iter('{http://www.w3.org/2005/Atom}entry'):
    print elem.tag, elem.attrib

I still only get the entry element on its own, not the children:

{http://www.w3.org/2005/Atom}entry {}

Any idea what I am doing wrong?

I have searched everywhere but can't figure this out. I am new to all this, so sorry if it is something stupid.

If you are parsing an Atom feed, you really want to use the feedparser library instead; it takes care of all these details, and many more, for you.

The {http://www.w3.org/2005/Atom} part is a namespace. ElementTree includes the namespace URI in every tag name, which is why the URL shows up in your output. You need to specify that namespace to select the entry tags:

for elem in tree.iterfind('ns:entry', {'ns': 'http://www.w3.org/2005/Atom'}):

where I used a dictionary to map the ns: prefix to the namespace, or you can use the same curly braces syntax:

for elem in tree.iterfind('{http://www.w3.org/2005/Atom}entry'):
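As a rough sketch of the two spellings side by side, the snippet below parses a small made-up Atom document and compares what each form selects. The sample XML and the names ATOM_NS, SAMPLE, by_prefix, by_braces and unqualified are assumptions for illustration, not the feed from the question. Note that iterfind with a plain tag path only looks at direct children of the element it is called on.

```python
import xml.etree.ElementTree as ET

ATOM_NS = 'http://www.w3.org/2005/Atom'

# Hypothetical sample document standing in for the real feed.
SAMPLE = (
    '<feed xmlns="http://www.w3.org/2005/Atom">'
    '<entry><id>1</id><title type="text">First</title></entry>'
    '<entry><id>2</id><title type="text">Second</title></entry>'
    '</feed>'
)

root = ET.fromstring(SAMPLE)

# Prefix form: 'ns' is an arbitrary prefix mapped to the namespace URI.
by_prefix = list(root.iterfind('ns:entry', {'ns': ATOM_NS}))

# Curly-brace form: the namespace URI is written into the tag name itself.
by_braces = list(root.iterfind('{%s}entry' % ATOM_NS))

# A bare 'entry' only matches elements that have no namespace at all.
unqualified = list(root.iterfind('entry'))

print(len(by_prefix), len(by_braces), len(unqualified))
```

Both qualified forms should select the same elements, while the unqualified name selects none, which matches the behaviour described in the question.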

Once you have the element, you still need to explicitly iterate over its children, because iter() with a tag argument yields only the elements that match that tag:

for elem in tree.iterfind('{http://www.w3.org/2005/Atom}entry'):
    for child in elem:
        print child
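To make that concrete, here is a minimal sketch that runs against a made-up Atom document rather than the real feed. The sample XML, its values, and the helpers local_name and entry_children are hypothetical names introduced for this example only.

```python
import xml.etree.ElementTree as ET

ATOM = '{http://www.w3.org/2005/Atom}'

# Hypothetical sample entry; the values are invented for illustration.
SAMPLE = (
    '<feed xmlns="http://www.w3.org/2005/Atom">'
    '<entry>'
    '<id>example-id</id>'
    '<published>2013-01-01T00:00:00Z</published>'
    '<title type="text">Example title</title>'
    '</entry>'
    '</feed>'
)


def local_name(tag):
    """Return the tag without its leading '{namespace}' part, if any."""
    return tag.rsplit('}', 1)[-1]


def entry_children(xml_text):
    """Return one list of (name, text, attrib) tuples per Atom entry."""
    root = ET.fromstring(xml_text)
    result = []
    # iter() with a tag yields only the matching entry elements...
    for entry in root.iter(ATOM + 'entry'):
        # ...so loop over each entry to reach its direct children.
        result.append(
            [(local_name(child.tag), child.text, child.attrib) for child in entry]
        )
    return result


for children in entry_children(SAMPLE):
    for name, text, attrib in children:
        print(name, text, attrib)
```

If you want every descendant rather than just the direct children, calling entry.iter() with no argument walks the whole subtree, starting with the entry element itself.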

