
key error in xml.find()

It seems the key in the namespace map can't be an empty string (""). This does not play nicely with collecting the namespaces dynamically using iterparse and the start-ns event, which reports the default namespace under an empty prefix. Here is an example:

from xml.etree.ElementTree import fromstring, iterparse, parse
import io
s = '''<?xml version="1.0"?>
<actors xmlns:fictional="http://characters.example.com"
        xmlns="http://people.example.com">
    <actor>
        <name>John Cleese</name>
        <fictional:character>Lancelot</fictional:character>
        <fictional:character>Archie Leach</fictional:character>
    </actor>
    <actor>
        <name>Eric Idle</name>
        <fictional:character>Sir Robin</fictional:character>
        <fictional:character>Gunther</fictional:character>
        <fictional:character>Commander Clement</fictional:character>
    </actor>
</actors>'''
# root = fromstring(s)
f = io.StringIO(s)
# get namespace dynamically
ns_map = {}
for event, elem in iterparse(f, ['start-ns']):
    ns, url = elem
    ns_map[ns] = url

f = io.StringIO(s)
root = parse(f)
ns_map
# {'fictional': 'http://characters.example.com',
#  '': 'http://people.example.com'}

root.find(':actor', ns_map)

It will give the following error:

    262     try:
--> 263         selector = _cache[cache_key]
    264     except KeyError:

KeyError: (':actor', (('', 'http://people.example.com'), ('fictional', 'http://characters.example.com')))

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
<ipython-input-32-b6e4ecf5f7e9> in <module>()
----> 1 x.find(':actor', ns_map)

/Users/zech/anaconda/envs/py35/lib/python3.5/xml/etree/ElementTree.py in find(self, path, namespaces)
    647                 FutureWarning, stacklevel=2
    648                 )
--> 649         return self._root.find(path, namespaces)
    650
    651     def findtext(self, path, default=None, namespaces=None):

/Users/zech/anaconda/envs/py35/lib/python3.5/xml/etree/ElementPath.py in find(elem, path, namespaces)
    296
    297 def find(elem, path, namespaces=None):
--> 298     return next(iterfind(elem, path, namespaces), None)
    299
    300 ##

/Users/zech/anaconda/envs/py35/lib/python3.5/xml/etree/ElementPath.py in iterfind(elem, path, namespaces)
    275         while 1:
    276             try:
--> 277                 selector.append(ops[token[0]](next, token))
    278             except StopIteration:
    279                 raise SyntaxError("invalid path")

KeyError: ':'

Is this a bug? What is the best solution?

You have to give a non-empty alias to the default (empty-prefix) namespace:

ns_map["empty"] = ns_map[""]
print(root.find('empty:actor', ns_map))
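The alias can also be assigned while the namespaces are being collected, so the empty key never ends up in the map. The following is a minimal sketch of that idea, not a definitive implementation: the helper name collect_namespaces and the alias "empty" are assumptions made for illustration, and the XML is a shortened version of the document from the question.

```python
import io
from xml.etree.ElementTree import iterparse, parse

XML = '''<?xml version="1.0"?>
<actors xmlns:fictional="http://characters.example.com"
        xmlns="http://people.example.com">
    <actor>
        <name>John Cleese</name>
        <fictional:character>Lancelot</fictional:character>
    </actor>
</actors>'''


def collect_namespaces(source, default_alias="empty"):
    """Map each declared prefix to its URI (hypothetical helper).

    The default namespace is reported with the prefix '', so it is
    stored under default_alias instead of the empty string.
    """
    ns_map = {}
    for _event, (prefix, uri) in iterparse(source, events=["start-ns"]):
        ns_map[prefix or default_alias] = uri
    return ns_map


ns_map = collect_namespaces(io.StringIO(XML))
tree = parse(io.StringIO(XML))

# Every element in the default namespace is addressed through the alias.
actor = tree.find("empty:actor", ns_map)
name = actor.find("empty:name", ns_map).text
character = actor.find("fictional:character", ns_map).text
print(name, character)
```

As a side note, the ElementTree documentation for Python 3.8 and later describes passing '' as a prefix in the namespaces dictionary to apply a namespace to unprefixed names, so find('actor', ns_map) with the original map may work there. A path such as ':actor' is still not valid syntax.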

Alternatively, if it suits your use case, you can strip the namespaces from the parsed document.
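One possible way to strip the namespaces is to rewrite each element's tag after parsing, since ElementTree stores a qualified name as "{uri}local". This is only a sketch under that assumption: strip_namespaces is a hypothetical helper, it leaves attribute names untouched, and it merges all namespaces into one, so it is unsuitable when two namespaces use the same local name.

```python
from xml.etree.ElementTree import fromstring

XML = '''<?xml version="1.0"?>
<actors xmlns:fictional="http://characters.example.com"
        xmlns="http://people.example.com">
    <actor>
        <name>John Cleese</name>
        <fictional:character>Lancelot</fictional:character>
    </actor>
    <actor>
        <name>Eric Idle</name>
        <fictional:character>Sir Robin</fictional:character>
    </actor>
</actors>'''


def strip_namespaces(root):
    """Drop the '{uri}' part of every element tag in place (hypothetical helper)."""
    for elem in root.iter():
        if isinstance(elem.tag, str) and elem.tag.startswith("{"):
            elem.tag = elem.tag.split("}", 1)[1]
    return root


root = strip_namespaces(fromstring(XML))

# Plain tag names now match without any namespace map.
names = [actor.findtext("name") for actor in root.findall("actor")]
characters = [elem.text for elem in root.iter("character")]
print(names, characters)
```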
