BeautifulSoup doesn't parse XML loaded from local file

Question

My Python script uses BeautifulSoup, but looking up an element returns None when the XML is read from a local file:

from bs4 import BeautifulSoup

xmlData = None

with open('conf//test2.xml', 'r') as xmlFile:
    xmlData = xmlFile.read()

# this creates a soup object out of xmlData,
# which is properly loaded from file above
xmlSoup = BeautifulSoup(xmlData, "html.parser")

# this resolves to None
subElemX = xmlSoup.root.singleelement.find('subElementX', recursive=False)

The file:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<root>
    <singleElement>
        <subElementX>XYZ</subElementX>
    </singleElement>
    <repeatingElement id="1"/>
    <repeatingElement id="2"/>
</root>

I also have a REST GET service that returns the same XML, but when I read it using requests.get, it is parsed fine:

resp = requests.get(serviceURL, headers=headers)

respXML = resp.content.decode("utf-8")

restSoup = BeautifulSoup(respXML, "html.parser")

Why does it work with the REST response and not with the data read out of a local file?

UPDATE: While I understand that Python is case-sensitive and singleelement != singleElement, the case is disregarded when parsing the web service response.

Answer 1

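Two things to make it work:

change the features argument from html.parser to xml (you are parsing XML data, and XML != HTML); note that the xml feature requires the lxml package to be installed
change singleelement to singleElement, because the XML parser preserves the case of tag names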

Changes applied (works for me):

xmlSoup = BeautifulSoup(xmlData, "xml")

subElemX = xmlSoup.root.singleElement.find('subElementX', recursive=False)
print(subElemX)  # prints <subElementX>XYZ</subElementX>
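To see the case sensitivity in isolation, here is a minimal sketch that performs the same lookup with the standard library's xml.etree.ElementTree instead of BeautifulSoup. It is not the code from this answer; the helper name find_sub_element and the inlined XML_TEXT (the file from the question without its XML declaration) are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# The question's file content, inlined as a string (XML declaration omitted for brevity).
XML_TEXT = """<root>
    <singleElement>
        <subElementX>XYZ</subElementX>
    </singleElement>
    <repeatingElement id="1"/>
    <repeatingElement id="2"/>
</root>"""


def find_sub_element(xml_text, parent_tag, child_tag):
    """Hypothetical helper: return the direct child of a direct child of the root, or None."""
    root = ET.fromstring(xml_text)
    parent = root.find(parent_tag)  # find() matches tag names case-sensitively
    if parent is None:
        return None
    return parent.find(child_tag)


match = find_sub_element(XML_TEXT, "singleElement", "subElementX")
miss = find_sub_element(XML_TEXT, "singleelement", "subElementX")

print(match.text)  # XYZ
print(miss)        # None
```

An XML parser keeps tag names exactly as written, so only the correctly cased name finds the element.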

Answer 2

HTML tag names are case-insensitive, so html.parser converts all tag names to lower case internally. Given that, the following line should work on a soup built with html.parser:

subElemX = xmlSoup.root.singleelement.find('subelementx', recursive=False)

But in general, you shouldn't parse XML documents with an HTML parser. XML is quite strict about its syntax, and that's for a good reason.
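The lowercasing can be observed directly with Python's built-in html.parser module, which is what BeautifulSoup uses when given the "html.parser" feature. The sketch below is only an illustration: the TagCollector class and collect_start_tags function are hypothetical names, and the input is a shortened version of the file from the question.

```python
from html.parser import HTMLParser

# Shortened version of the question's XML, inlined as a string.
XML_TEXT = """<root>
    <singleElement>
        <subElementX>XYZ</subElementX>
    </singleElement>
</root>"""


class TagCollector(HTMLParser):
    """Hypothetical helper that records every start tag name the parser reports."""

    def __init__(self):
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes the tag name already converted to lower case.
        self.start_tags.append(tag)


def collect_start_tags(text):
    collector = TagCollector()
    collector.feed(text)
    collector.close()
    return collector.start_tags


tags = collect_start_tags(XML_TEXT)
print(tags)  # ['root', 'singleelement', 'subelementx']
```

Because the original casing is gone by the time the tree is built, a lookup for subElementX cannot match anything in a soup created with html.parser.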

solution1
2 ACCEPTED 2016-11-16 18:18:51

solution2
1 2016-11-16 18:22:56