Can't access child elements in XML

I am trying to parse an XML string into plain Python objects. I tried the find and findall methods to access some child elements, but they don't work.

Here is the XML data I'm trying to parse:

<?xml version="1.0" ?>
<ItemSearchResponse
    xmlns="http://webservices.amazon.com/AWSECommerceService/2011-08-01">

    <Items>
        <Request>
            <IsValid>True</IsValid>
            <ItemSearchRequest>
                <Keywords>iphone</Keywords>
                <ResponseGroup>ItemAttributes</ResponseGroup>
                <SearchIndex>All</SearchIndex>
            </ItemSearchRequest>
        </Request>
        <TotalResults>40721440</TotalResults>
        <TotalPages>4072144</TotalPages>

        <Item>
            <ASIN>B00YV50QU4</ASIN>
            <ParentASIN>B018GTHAKO</ParentASIN>
            <DetailPageURL>http://www.amazon.com/Apple-iPhone-MD439LL-Smartphone-Refurbished/dp/B00YV50QU4%3Fpsc%3D1%26SubscriptionId%3DAKIAIEEA4BKMTHTI2T7A%26tag%3Dshopit021-20%26linkCode%3Dxm2%26camp%3D2025%26creative%3D165953%26creativeASIN%3DB00YV50QU4</DetailPageURL>
            <ItemLinks>

            </ItemLinks>
            <ItemAttributes>
            </ItemAttributes>
        </Item>
        <Item>
            <ASIN>B00VHSXBUA</ASIN>
            <ParentASIN>B0152TROY8</ParentASIN>
            <ItemAttributes>
            </ItemAttributes>
        </Item>
    </Items>
</ItemSearchResponse>

I've deleted some data to keep this sample short.

And here is my code:

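from lxml import etree as et  # assumed import; not shown in the original snippet

data = et.fromstring(response)
items = data[0][3]               # index access: the first Item element
print(items.tag)
items = data[0].findall('item')  # name lookup: returns an empty list
print(len(items))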

The first approach, accessing the child node (Item) by list index, works fine. The findall method does not: it matches nothing, so len() always returns 0.

I've tried XPath and other approaches, but indexing is the only way I can get it to work.

Why don't methods like find and findall work?


Because there are no elements named Item. Your document declares a default XML namespace, http://webservices.amazon.com/AWSECommerceService/2011-08-01, so an element written as <Item> in your document actually belongs to that namespace. It is a different element from an <Item> in a document with no default namespace (or with a different one), and a search for the bare name does not match it. Note also that XML names are case-sensitive, so 'item' would not match <Item> even without the namespace.
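The effect is easy to see by printing the tags the parser actually stores. This is a minimal sketch using the standard library's xml.etree.ElementTree and a cut-down copy of the document from the question; the names SAMPLE, NS, root and items_parent are illustrative and do not come from the original code.

```python
import xml.etree.ElementTree as ET

NS = 'http://webservices.amazon.com/AWSECommerceService/2011-08-01'

# Cut-down copy of the response from the question (illustrative sample).
SAMPLE = """<ItemSearchResponse xmlns="%s">
    <Items>
        <TotalResults>40721440</TotalResults>
        <Item><ASIN>B00YV50QU4</ASIN></Item>
        <Item><ASIN>B00VHSXBUA</ASIN></Item>
    </Items>
</ItemSearchResponse>""" % NS

root = ET.fromstring(SAMPLE)
items_parent = root[0]  # the <Items> element

# Every tag is stored in {namespace-uri}localname form.
child_tags = [child.tag for child in items_parent]
print(child_tags[1])

# The bare name matches nothing; the qualified name matches both items.
bare = items_parent.findall('Item')
qualified = items_parent.findall('{%s}Item' % NS)
print(len(bare), len(qualified))
```

Index access works in the question because it ignores names entirely, while find and findall compare against the full namespaced tag.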

You want something like:

>>> ns = 'http://webservices.amazon.com/AWSECommerceService/2011-08-01'
>>> items = data[0].findall('{%s}Item' % ns)
>>> items
[<Element {http://webservices.amazon.com/AWSECommerceService/2011-08-01}Item at 0x7f1cbaaba8c0>, <Element {http://webservices.amazon.com/AWSECommerceService/2011-08-01}Item at 0x7f1cbaaba680>]

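Or, using XPath (the xpath method is available on lxml elements, not in the standard library's xml.etree.ElementTree):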

>>> items = data[0].xpath('n:Item', namespaces={'n': ns})
>>> items
[<Element {http://webservices.amazon.com/AWSECommerceService/2011-08-01}Item at 0x7f1cbaaba8c0>, <Element {http://webservices.amazon.com/AWSECommerceService/2011-08-01}Item at 0x7f1cbaaba680>]
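If lxml is not available, the standard library accepts a prefix-to-namespace mapping in find, findall and findtext, which avoids repeating the full URI. The sketch below assumes that approach and turns each Item into a plain dict, which is what the question set out to do; the prefix aws, the function parse_items and the dict keys are made up for illustration.

```python
import xml.etree.ElementTree as ET

# The prefix 'aws' is arbitrary; only the URI has to match the document.
NS = {'aws': 'http://webservices.amazon.com/AWSECommerceService/2011-08-01'}

SAMPLE = """<?xml version="1.0" ?>
<ItemSearchResponse
    xmlns="http://webservices.amazon.com/AWSECommerceService/2011-08-01">
    <Items>
        <Item>
            <ASIN>B00YV50QU4</ASIN>
            <ParentASIN>B018GTHAKO</ParentASIN>
        </Item>
        <Item>
            <ASIN>B00VHSXBUA</ASIN>
            <ParentASIN>B0152TROY8</ParentASIN>
        </Item>
    </Items>
</ItemSearchResponse>"""


def parse_items(xml_text):
    """Return one plain dict per <Item> element."""
    root = ET.fromstring(xml_text)
    results = []
    # './/aws:Item' finds Item elements at any depth below the root.
    for item in root.findall('.//aws:Item', NS):
        results.append({
            'asin': item.findtext('aws:ASIN', namespaces=NS),
            'parent_asin': item.findtext('aws:ParentASIN', namespaces=NS),
        })
    return results


parsed = parse_items(SAMPLE)
print(parsed)
```

Every step in the path needs the prefix, including the child lookups inside the loop, because the default namespace applies to all descendants of the root.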
