Unable to access child elements in XML

Question

I am trying to parse an XML string into plain Python objects. I tried the find and findall methods to access some child elements, but they don't work.

Here is the XML data I am trying to parse:

<?xml version="1.0" ?>
<ItemSearchResponse
    xmlns="http://webservices.amazon.com/AWSECommerceService/2011-08-01">

    <Items>
        <Request>
            <IsValid>True</IsValid>
            <ItemSearchRequest>
                <Keywords>iphone</Keywords>
                <ResponseGroup>ItemAttributes</ResponseGroup>
                <SearchIndex>All</SearchIndex>
            </ItemSearchRequest>
        </Request>
        <TotalResults>40721440</TotalResults>
        <TotalPages>4072144</TotalPages>

        <Item>
            <ASIN>B00YV50QU4</ASIN>
            <ParentASIN>B018GTHAKO</ParentASIN>
            <DetailPageURL>http://www.amazon.com/Apple-iPhone-MD439LL-Smartphone-Refurbished/dp/B00YV50QU4%3Fpsc%3D1%26SubscriptionId%3DAKIAIEEA4BKMTHTI2T7A%26tag%3Dshopit021-20%26linkCode%3Dxm2%26camp%3D2025%26creative%3D165953%26creativeASIN%3DB00YV50QU4</DetailPageURL>
            <ItemLinks>

            </ItemLinks>
            <ItemAttributes>
            </ItemAttributes>
        </Item>
        <Item>
            <ASIN>B00VHSXBUA</ASIN>
            <ParentASIN>B0152TROY8</ParentASIN>
            <ItemAttributes>
            </ItemAttributes>
        </Item>
    </Items>
</ItemSearchResponse>

I've deleted some data to keep this sample short.

And here is my code:

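from lxml import etree as et  # assumed import; the original post does not show it

data = et.fromstring(response)
item = data[0][3]                # index access reaches the first <Item>
print(item.tag)
items = data[0].findall('item')  # returns an empty list
print(len(items))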

The first way to access the child nodes ('item') is list index notation, and it works fine. With the findall method, however, nothing is found and len() always returns 0.

I've tried XPath and other approaches, but indexing is the only way I can get it to work.

Why don't methods like find and findall work?

Answer 1

Why don't methods like find and findall work?

Because there are no elements named Item. Your document defines a default XML namespace of http://webservices.amazon.com/AWSECommerceService/2011-08-01, which means that an element that looks like <Item> in your document is actually contained in that namespace. It is different from an element that looks like <Item> in a document without a default XML namespace (or with a different XML namespace). Element names are also case-sensitive, so the lowercase 'item' used in the question would not match <Item> even without a namespace.

You want something like:
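
To see the effect of the default namespace directly, here is a minimal sketch. It assumes the standard library's xml.etree.ElementTree rather than whichever library the question used, and the SAMPLE string is a trimmed copy of the document made up for illustration.

```python
import xml.etree.ElementTree as ET

NS = "http://webservices.amazon.com/AWSECommerceService/2011-08-01"

# Trimmed copy of the document from the question (illustrative only).
SAMPLE = """<?xml version="1.0" ?>
<ItemSearchResponse xmlns="%s">
    <Items>
        <TotalResults>40721440</TotalResults>
        <Item><ASIN>B00YV50QU4</ASIN></Item>
        <Item><ASIN>B00VHSXBUA</ASIN></Item>
    </Items>
</ItemSearchResponse>""" % NS

root = ET.fromstring(SAMPLE)
items_parent = root[0]  # the <Items> element

# Each tag is stored as "{namespace-uri}LocalName", not as the bare name.
child_tags = [child.tag for child in items_parent]
print(child_tags[1])

# A bare name therefore matches nothing, whatever its capitalisation.
bare_matches = items_parent.findall("Item")
lower_matches = items_parent.findall("item")
print(len(bare_matches), len(lower_matches))
```

Printing a tag this way is a quick check whenever a lookup unexpectedly comes back empty.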

>>> ns = 'http://webservices.amazon.com/AWSECommerceService/2011-08-01'
>>> items = data[0].findall('{%s}Item' % ns)
>>> items
[<Element {http://webservices.amazon.com/AWSECommerceService/2011-08-01}Item at 0x7f1cbaaba8c0>, <Element {http://webservices.amazon.com/AWSECommerceService/2011-08-01}Item at 0x7f1cbaaba680>]
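
If spelling out the full {uri} prefix on every lookup becomes tedious, find and findall also accept a prefix-to-URI mapping. The sketch below assumes the standard library's xml.etree.ElementTree and a shortened sample document; the prefix aws is an arbitrary name chosen here and does not need to appear in the XML itself.

```python
import xml.etree.ElementTree as ET

NS = "http://webservices.amazon.com/AWSECommerceService/2011-08-01"
NSMAP = {"aws": NS}  # "aws" is an arbitrary local prefix (hypothetical name)

# Shortened sample based on the question's document (illustrative only).
SAMPLE = """<?xml version="1.0" ?>
<ItemSearchResponse xmlns="%s">
    <Items>
        <TotalPages>4072144</TotalPages>
        <Item><ASIN>B00YV50QU4</ASIN></Item>
        <Item><ASIN>B00VHSXBUA</ASIN></Item>
    </Items>
</ItemSearchResponse>""" % NS

root = ET.fromstring(SAMPLE)

# Direct children of <Items> named Item, resolved through the mapping.
items = root[0].findall("aws:Item", NSMAP)

# The same mapping works for longer paths starting at the root.
asin_elements = root.findall("./aws:Items/aws:Item/aws:ASIN", NSMAP)
asins = [element.text for element in asin_elements]
print(len(items), asins)
```

Keeping the mapping in one module-level constant means the namespace URI is written only once.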

Or, using XPath:

>>> items = data[0].xpath('n:Item', namespaces={'n': ns})
>>> items
[<Element {http://webservices.amazon.com/AWSECommerceService/2011-08-01}Item at 0x7f1cbaaba8c0>, <Element {http://webservices.amazon.com/AWSECommerceService/2011-08-01}Item at 0x7f1cbaaba680>]
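
Since the stated goal was to end up with plain Python objects, the namespace-aware lookups can be wrapped in a small helper. This is only a sketch under assumptions: it uses the standard library's xml.etree.ElementTree, the function name parse_items and the dictionary keys are made up for illustration, and it reads just the two fields present in the shortened sample.

```python
import xml.etree.ElementTree as ET

NSMAP = {"aws": "http://webservices.amazon.com/AWSECommerceService/2011-08-01"}

# Shortened sample based on the question's document (illustrative only).
SAMPLE = """<?xml version="1.0" ?>
<ItemSearchResponse
    xmlns="http://webservices.amazon.com/AWSECommerceService/2011-08-01">
    <Items>
        <Item>
            <ASIN>B00YV50QU4</ASIN>
            <ParentASIN>B018GTHAKO</ParentASIN>
        </Item>
        <Item>
            <ASIN>B00VHSXBUA</ASIN>
            <ParentASIN>B0152TROY8</ParentASIN>
        </Item>
    </Items>
</ItemSearchResponse>"""


def parse_items(xml_text):
    """Return one dict per <Item>; a missing child yields None."""
    root = ET.fromstring(xml_text)
    results = []
    # ".//aws:Item" finds Item elements at any depth below the root.
    for item in root.iterfind(".//aws:Item", NSMAP):
        results.append({
            "asin": item.findtext("aws:ASIN", default=None, namespaces=NSMAP),
            "parent_asin": item.findtext(
                "aws:ParentASIN", default=None, namespaces=NSMAP
            ),
        })
    return results


parsed = parse_items(SAMPLE)
print(parsed)
```

Unlike data[0][3], this does not depend on the position of <Item> among its siblings, so it keeps working when the response contains more or fewer elements before the first item.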

Answer 1
1 accepted 2016-02-03 19:03:30
