XPath to dict in Python with lxml and XML

Is there a quick way to convert the following XML to a dictionary using XPath from lxml in Python? Or is there any other efficient way?

<rec item="1">
    <tag name="atr1">random text</tag>
    <tag name="atr2">random text</tag>
    ..................................        
</rec>
<rec item="2">
    <tag name="atr1">random text2</tag>
    <tag name="atr2">random text2</tag>
    ..................................        
</rec>
<rec item="3">
    <tag name="atr1">random text3</tag>
    <tag name="atr2">random text3</tag>
    ..................................        
</rec>

I need a list of dictionaries like this, or something similar:

dic = [
    {
        'atr1': 'random text',
        'atr2': 'random text'
    },
    {
        'atr1': 'random text2',
        'atr2': 'random text2'
    },
    {
        'atr1': 'random text3',
        'atr2': 'random text3'
    }
]

You can use a list comprehension together with a dictionary comprehension:

[
    {tag.xpath('string(@name)'): tag.xpath('string()') for tag in record.xpath('tag')}
    for record in records.xpath('//rec')
]

Here is a complete example:

from lxml import etree as ET
xml = '''<records>
<rec item="1">
    <tag name="atr1">random text</tag>
    <tag name="atr2">random text</tag>
    ..................................        
</rec>
<rec item="2">
    <tag name="atr1">random text2</tag>
    <tag name="atr2">random text2</tag>
    ..................................        
</rec>
<rec item="3">
    <tag name="atr1">random text3</tag>
    <tag name="atr2">random text3</tag>
    ..................................        
</rec>
</records>'''
records = ET.fromstring(xml)
# One dict per <rec>: key is each tag's name attribute, value is its text
rec_list = [
    {tag.xpath('string(@name)'): tag.xpath('string()') for tag in rec.xpath('tag')}
    for rec in records.xpath('rec')
]
print(rec_list)

Output:

[{'atr1': 'random text', 'atr2': 'random text'}, {'atr1': 'random text2', 'atr2': 'random text2'}, {'atr1': 'random text3', 'atr2': 'random text3'}]
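
The same list can also be built without XPath string functions, using plain element access (`findall`, `get`, `.text`). The following is only a sketch under a few assumptions: `records_to_dicts` is a hypothetical helper name, the sample XML is shortened to two records, and the fallback import of the standard library's `xml.etree.ElementTree` is added only so the snippet also runs where lxml is not installed. Note that `.text` returns only the text before the first child element (and `None` for an empty tag), whereas `string()` concatenates all descendant text.

```python
try:
    from lxml import etree as ET  # as in the answer above
except ImportError:
    import xml.etree.ElementTree as ET  # stdlib fallback; supports the calls used here

# Shortened sample, wrapped in a single root element so it is well-formed
xml = '''<records>
<rec item="1">
    <tag name="atr1">random text</tag>
    <tag name="atr2">random text</tag>
</rec>
<rec item="2">
    <tag name="atr1">random text2</tag>
    <tag name="atr2">random text2</tag>
</rec>
</records>'''


def records_to_dicts(root):
    """Return one dict per <rec>, mapping each tag's name attribute to its text."""
    return [
        # tag.text is None for an empty element, so fall back to ''
        {tag.get('name'): (tag.text or '') for tag in rec.findall('tag')}
        for rec in root.findall('rec')
    ]


root = ET.fromstring(xml)
rec_list = records_to_dicts(root)
print(rec_list)
```

For flat records like these the result should match the XPath version; the XPath form is the safer choice when a tag may contain nested markup.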

You can try the following code:

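import lxml.etree

# 'xml_source_is_here' is a placeholder: pass the actual XML string (with a single root element)
source = lxml.etree.fromstring('xml_source_is_here')
# Pair every name attribute with the corresponding tag text, in document order
[{attr: text} for attr, text in zip(source.xpath('//tag/@name'), source.xpath('//tag/text()'))]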

Output:

[{'atr1': 'random text'}, {'atr2': 'random text'}, 
{'atr1': 'random text2'}, {'atr2': 'random text2'}, 
{'atr1': 'random text3'}, {'atr2': 'random text3'}]
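
This output has one dictionary per tag rather than one per record, and the `item` attribute of each `rec` is lost. If a grouped result is wanted, one possible variation is to key the records by their `item` attribute. This is a sketch, not part of the answer above: the name `by_item` is hypothetical, the sample XML is shortened to two records, and the standard-library fallback import is an assumption added so the snippet runs without lxml.

```python
try:
    from lxml import etree as ET
except ImportError:
    import xml.etree.ElementTree as ET  # stdlib fallback; supports the calls used here

# Shortened sample, wrapped in a single root element so it is well-formed
xml = '''<records>
<rec item="1">
    <tag name="atr1">random text</tag>
    <tag name="atr2">random text</tag>
</rec>
<rec item="2">
    <tag name="atr1">random text2</tag>
    <tag name="atr2">random text2</tag>
</rec>
</records>'''

source = ET.fromstring(xml)

# Outer key: the rec's item attribute (a string); inner dict: name -> text
by_item = {
    rec.get('item'): {tag.get('name'): tag.text for tag in rec.iter('tag')}
    for rec in source.iter('rec')
}
print(by_item)
```

Pairing names and texts inside each `rec` also avoids a pitfall of zipping two document-wide XPath results: an empty tag contributes a name but no text node, which would shift every later pair.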
