
Python, LXML ElementTree not making elementtree out of STYLE element

I'm trying to extract elements from an XML document. It works for the entire document except for anything inside a STYLE element: lxml does not build a subtree for it and instead returns its contents as one block of text. Parsing goes back to normal after the closing tag. I thought STYLE might be a reserved element name, but I cannot find anything confirming this. Maybe I'm missing something obvious.

import requests
import lxml.html

response = requests.get('http://www.beerxml.com/recipes.xml')

def depth(node):
    d = 0
    while node is not None:
        d += 1
        node = node.getparent()
    return d

tree = lxml.html.fromstring(response.content)

for recipe in tree:
    for child in recipe.iter():
        print(depth(child), child.tag, '\t\t\t', child.text)

Result:

5 style              
 <NAME>Witbier</NAME>
 <VERSION>1</VERSION>
 <CATEGORY>Belgian &amp; French Ale</CATEGORY>
 <CATEGORY_NUMBER>1</CATEGORY_NUMBER>
...

Expected result:

5 style              
6 name Witbier
6 version 1
6 category Belgian &amp; French Ale
6 category_number 1
....

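lxml.html parses the input with an HTML parser. In HTML, a <style> element holds raw text (CSS), so everything inside it is kept as a single text string instead of being parsed into child elements. The file is XML, not HTML, so use the XML parser: import lxml.etree instead of lxml.html.

Then replace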

tree = lxml.html.fromstring(response.content)

with

tree = lxml.etree.fromstring(response.content)
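
The difference can be reproduced without a network request. The following is a minimal sketch, assuming a small hand-written sample shaped like the BeerXML feed (SAMPLE is hypothetical, not the real recipes.xml). It uses lxml.html.document_fromstring rather than lxml.html.fromstring only so that the <style> element can be looked up regardless of where the HTML parser places it.

```python
import lxml.etree
import lxml.html

# Hypothetical sample shaped like the BeerXML feed; not the real recipes.xml.
SAMPLE = b"""<RECIPES>
  <RECIPE>
    <NAME>Example</NAME>
    <STYLE>
      <NAME>Witbier</NAME>
      <VERSION>1</VERSION>
      <CATEGORY>Belgian &amp; French Ale</CATEGORY>
    </STYLE>
  </RECIPE>
</RECIPES>"""


def depth(node):
    """Count the node itself plus all of its ancestors."""
    d = 0
    while node is not None:
        d += 1
        node = node.getparent()
    return d


# XML parser: STYLE is an ordinary element with real child elements.
xml_root = lxml.etree.fromstring(SAMPLE)
xml_style = xml_root.find(".//STYLE")
xml_children = [(depth(c), c.tag, c.text) for c in xml_style]

# HTML parser: tag names are lowercased and <style> is treated as raw text.
html_doc = lxml.html.document_fromstring(SAMPLE)
html_style = next(html_doc.iter("style"))
html_child_count = len(html_style)

for row in xml_children:
    print(*row)
print("children seen by the HTML parser:", html_child_count)
```

Note that the XML parser preserves the case of tag names, so the tags come back as STYLE, NAME, and so on, not in lowercase as in the expected output above. The XML parser also decodes &amp; to & in the element text.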

