Parsing nested XML in Python?

I have the following Goodreads response:

<GoodreadsResponse>
   <Request>
   </Request>
   <book>
    <popular_shelves>
        <shelf name="test" other="1"/>
        <shelf name="test2" other="2"/>
    </popular_shelves>
   </book>
</GoodreadsResponse>

I want to retrieve the 2nd shelf item in popular_shelves (index 1).

Attempt 1:

from urllib.request import urlopen
from xml.etree import ElementTree as ET

root = ET.parse(urlopen(baseEndpoint + bookName)).getroot()
for atype in root.findall('book/popular_shelves'):
    print(atype.get('shelf'))

Attempt 2:

genre = root.find('book').findall('popular_shelves')[0].findall('shelf')
print(genre[0].text)

This is how I would get the 2nd shelf item from popular_shelves:

import xml.etree.ElementTree as ET

payload = '''
<GoodreadsResponse>
   <Request>
   </Request>
   <book>
    <popular_shelves>
        <shelf name="test" other="1"/>
        <shelf name="test2" other="2"/>
    </popular_shelves>
   </book>
</GoodreadsResponse>
'''

root = ET.fromstring(payload)
# Collect every <shelf> element under book/popular_shelves into a list.
shelves = root.findall("./book/popular_shelves/shelf")
# Index 1 is the 2nd shelf; its data lives in attributes, so use get().
print(shelves[1].get('name'))

This collects all the shelf items under ./book/popular_shelves into a list. You can then use list indexing to access the 1st, 2nd, and subsequent shelf items.
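To show why the two attempts in the question print None, and how the 2nd shelf can be selected directly, here is a minimal sketch using only the standard library. It assumes the same sample payload as above. The helper `shelf_name` is a hypothetical name introduced for illustration, not part of ElementTree.

```python
import xml.etree.ElementTree as ET

payload = """
<GoodreadsResponse>
   <Request>
   </Request>
   <book>
    <popular_shelves>
        <shelf name="test" other="1"/>
        <shelf name="test2" other="2"/>
    </popular_shelves>
   </book>
</GoodreadsResponse>
"""

root = ET.fromstring(payload)
container = root.find("book/popular_shelves")

# Attempt 1 asked <popular_shelves> for an *attribute* called "shelf".
# shelf is a child element, not an attribute, so get() returns None.
attempt1 = container.get("shelf")

# Attempt 2 read .text, but <shelf .../> is self-closing and has no text;
# its data is stored in the name/other attributes.
attempt2 = container.findall("shelf")[0].text

# ElementTree's XPath subset supports 1-based positional predicates,
# so shelf[2] selects the second <shelf> child directly.
second = root.find("./book/popular_shelves/shelf[2]")


def shelf_name(root, index):
    """Hypothetical helper: name of the shelf at a 0-based index, or None."""
    shelves = root.findall("./book/popular_shelves/shelf")
    return shelves[index].get("name") if 0 <= index < len(shelves) else None


print(attempt1, attempt2)
print(second.get("name"), second.attrib)
print(shelf_name(root, 1), shelf_name(root, 5))
```

Note that the XPath predicate counts from 1 while list indexing counts from 0, so `shelf[2]` and `shelves[1]` refer to the same element. The bounds check in the helper avoids an IndexError when a response contains fewer shelves than expected.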

You can also try the untangle module, which is simple and easy to use. For example:

In [95]: from untangle import parse

In [96]: payload = '''
    ...: <GoodreadsResponse>
    ...:    <Request>
    ...:    </Request>
    ...:    <book>
    ...:     <popular_shelves>
    ...:         <shelf name="test" other="1"/>
    ...:         <shelf name="test2" other="2"/>
    ...:     </popular_shelves>
    ...:    </book>
    ...: </GoodreadsResponse>
    ...: '''

In [97]: obj = parse(payload)

In [98]: shelf1 = obj.GoodreadsResponse.book.popular_shelves.shelf[1]

In [99]: vars(shelf1)
Out[99]:
{'_attributes': {u'name': u'test2', u'other': u'2'},
 '_name': u'shelf',
 'cdata': '',
 'children': [],
 'is_root': False}
