Python ElementTree XML IOError: [Errno 22] invalid mode ('rb') or filename

Question

With the following code:

import xml.etree.cElementTree as ET
tree = ET.parse(r'https://apitest.batchbook.com/api/v1/people.xml?auth_token=GR5doLv88FrnLyLGIwok')

I get the error message:

IOError                                   Traceback (most recent call last)
<ipython-input-10-d91d452da3e7> in <module>()
----> 1 tree = ET.parse(r'https://apitest.batchbook.com/api/v1/people.xml?auth_token=GR5doLv88FrnLyLGIwok')

<string> in parse(source, parser)

<string> in parse(self, source, parser)

IOError: [Errno 22] invalid mode ('rb') or filename: 'https://apitest.batchbook.com/api/v1/people.xml?auth_token=GR5doLv88FrnLyLGIwok'

However, if I open the link above in a browser, and save this to an XML-file (people.xml), and then do:

tree = ET.parse(r'C:\Users\Eric\Downloads\people.xml')
tree.getroot()

I get the result: <Element 'people' at 0x00000000086AA420>

Any clue as to why using the link does not work? Thanks :)

Answer 1

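ET.parse expects either a filename or a file-like object. When you pass it a string, it treats the string as a path and tries to open it in 'rb' mode. There is no file of that name anywhere in your filesystem, hence the IOError. etree does not understand that this is really a web address, and it has no way to fetch one even if it did.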
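To make the failure mode concrete, here is a minimal sketch written for Python 3 (an assumption; the question uses Python 2, where the error surfaces as IOError rather than OSError). The helper name try_parse and the example.invalid URL are hypothetical and only serve the illustration. No network access takes place: the string is simply handed to the file-opening machinery.

```python
import xml.etree.ElementTree as ET


def try_parse(source):
    """Return the parsed tree, or the OSError raised while opening `source`.

    Hypothetical helper, for illustration only.
    """
    try:
        # A plain string is treated as a filesystem path and opened in 'rb' mode.
        return ET.parse(source)
    except OSError as exc:
        # No such file exists locally, so opening it fails.
        return exc


result = try_parse("https://example.invalid/people.xml?x=1")
print(type(result).__name__, result)
```

The exact exception subclass and message depend on the operating system, which is why the sketch only catches the OSError base class.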

Instead, you should do something like:

import xml.etree.cElementTree as ET
import urllib2, StringIO

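# Fetch the XML over HTTP; ET.parse cannot open URLs itself.
page_with_xml = urllib2.urlopen(r'https://apitest.batchbook.com/api/v1/people.xml?auth_token=GR5doLv88FrnLyLGIwok')
# Copy the response body into an in-memory, file-like buffer.
io_xml = StringIO.StringIO()
io_xml.write(page_with_xml.read())
io_xml.seek(0)  # rewind so the parser reads from the start
tree = ET.parse(io_xml)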

Edited to correct for the fact that ET.parse is looking for a file-like object (or a filename), not a URL. Not particularly elegant, but it gets the job done.
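For readers on Python 3, where urllib2 and the StringIO module no longer exist, the same idea could be sketched as follows. This is an assumption-laden variant, not the original answer's code: the helper name parse_xml_from_url and the timeout default are hypothetical. Since the object returned by urllib.request.urlopen already has a read() method, it can be passed to ET.parse directly, so the intermediate buffer is left out here.

```python
import urllib.request
import xml.etree.ElementTree as ET


def parse_xml_from_url(url, timeout=10):
    """Fetch `url` and return the parsed ElementTree.

    Hypothetical helper; `timeout` (in seconds) is an illustrative default.
    """
    # urlopen returns a file-like response object, which ET.parse accepts
    # as-is, so no in-memory copy of the body is needed.
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return ET.parse(response)
```

Calling parse_xml_from_url with the URL from the question and then getroot() on the result should give the same 'people' root element as parsing the saved file, provided the server is reachable and the token is still valid.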
