
Python - lxml to re-order xml tags

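Some sections of my XML need to be re-ordered. I know that XML does not require a particular order, but this is what I need to do, and I can't figure out the "right" way to do it. I'm using lxml and have been re-ordering with the .insert command. I need to re-order the contents of every <asset type="preview"> tag so that it looks like this: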

    <asset type="preview">
        <territories>
            <territory>SE</territory>
        </territories>
        <data_file role="source">
            <locale name="es"/>
            <file_name>some_name_nor-preview-sv.mov</file_name>
            <size>1715119116</size>
            <checksum type="md5">55cd94d051700be34014b2892e925fa1</checksum>
            <attribute name="crop.top">25</attribute>
            <attribute name="crop.bottom">25</attribute>
            <attribute name="crop.left">4</attribute>
            <attribute name="crop.right">4</attribute>
            <attribute name="image.burned_subtitles.locale">sv</attribute>
            <attribute name="image.textless_master">false</attribute>
        </data_file>
    </asset>

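Sometimes I have multiple <asset type="preview"> elements and sometimes none. Likewise, an <asset type="preview"> does not always contain all of the tags listed here. This is the part of my XML that I want to re-order as described above.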

    <asset type="preview">
        <data_file role="source">
            <size>1657800204</size>
            <file_name>some_name_nor-preview.mov</file_name>
            <checksum type="md5">c61dfa7139ab04560cac41cf5ba8a1f2</checksum>
            <locale name="es"/>
            <attribute name="crop.top">25</attribute>
            <attribute name="crop.right">4</attribute>
            <attribute name="crop.bottom">25</attribute>
            <attribute name="crop.left">4</attribute>
        </data_file>
        <territories>
            <territory>WW</territory>
        </territories>
        <data_file role="notes">
            <size>9642</size>
            <file_name>some_name_nor-preview-notes.pdf</file_name>
            <checksum type="md5">4d0dc3534cd1d0f9885afbfda9be8b71</checksum>
        </data_file>
    </asset>
    <asset type="preview">
        <data_file role="source">
            <size>1715119116</size>
            <file_name>some_name_nor-preview-sv.mov</file_name>
            <checksum type="md5">55cd94d051700be34014b2892e925fa1</checksum>
            <locale name="es"/>
            <attribute name="image.burned_subtitles.locale">sv</attribute>
            <attribute name="crop.top">25</attribute>
            <attribute name="crop.right">4</attribute>
            <attribute name="image.textless_master">false</attribute>
            <attribute name="crop.left">4</attribute>
            <attribute name="crop.bottom">25</attribute>
        </data_file>
        <territories>
            <territory>SE</territory>
        </territories>
    </asset>
    <asset type="preview">
        <data_file role="source">
            <size>1709158524</size>
            <file_name>some_name_nor-preview-fi.mov</file_name>
            <checksum type="md5">58c5fcfa718393f76cb9b2d8f7c10362</checksum>
            <locale name="es"/>
            <attribute name="crop.bottom">25</attribute>
            <attribute name="crop.top">25</attribute>
            <attribute name="crop.left">4</attribute>
            <attribute name="image.textless_master">false</attribute>
            <attribute name="crop.right">4</attribute>
            <attribute name="image.burned_subtitles.locale">fi</attribute>
        </data_file>
        <territories>
            <territory>FI</territory>
        </territories>
    </asset>
    <asset type="preview">
        <territories>
            <territory>NO</territory>
        </territories>
        <data_file role="source">
            <size>1718632572</size>
            <file_name>some_name_nor-preview-no.mov</file_name>
            <checksum type="md5">41734d9d8dd4165416a4369f4ce9c8e1</checksum>
            <locale name="es"/>
            <attribute name="crop.left">4</attribute>
            <attribute name="crop.top">25</attribute>
            <attribute name="crop.bottom">25</attribute>
            <attribute name="image.textless_master">false</attribute>
            <attribute name="image.burned_subtitles.locale">no</attribute>
            <attribute name="crop.right">4</attribute>
        </data_file>
    </asset>
    <asset type="preview">
        <territories>
            <territory>DK</territory>
        </territories>
        <data_file role="source">
            <size>1721312028</size>
            <file_name>some_name_nor-preview-da.mov</file_name>
            <checksum type="md5">919abd17baf680161a220dbae8409918</checksum>
            <locale name="es"/>
            <attribute name="image.textless_master">false</attribute>
            <attribute name="crop.bottom">25</attribute>
            <attribute name="image.burned_subtitles.locale">da</attribute>
            <attribute name="crop.right">4</attribute>
            <attribute name="crop.left">4</attribute>
            <attribute name="crop.top">25</attribute>
        </data_file>
    </asset>

Here is my current "not working" code. It does not re-order the attribute[@name=...] tags, and I don't know whether this is the right approach:

        a = 0
        b = 0
        for node_search in tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']"):
            for element in node_search.iter(tag='locale'):
                node_products = tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']")[b]
                node_type = tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']/locale")[b]
                node_products.insert(a, node_type)
                b = b+1
            a = a+1
            b = 0
        for node_search in tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']"):
            for element in node_search.iter(tag='file_name'):
                node_products = tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']")[b]
                node_type = tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']/file_name")[b]
                node_products.insert(a, node_type)
                b = b+1
            a = a+1           
            b = 0
        for node_search in tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']"):
            for element in node_search.iter(tag='size'):
                node_products = tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']")[b]
                node_type = tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']/size")[b]
                node_products.insert(a, node_type)
                b = b+1
            a = a+1
            b = 0
        for node_search in tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']"):
            for element in node_search.iter(tag='checksum'):
                node_products = tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']")[b]
                node_type = tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']/checksum")[b]
                node_products.insert(a, node_type)
                b = b+1
            a = a+1
            b = 0
        for node_search in tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']"):
            for element in node_search.iter(tag="attribute[@name='crop.top']"):
                node_products = tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']")[b]
                node_type = tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']/attribute[@name='crop.top']")[b]
                node_products.insert(a, node_type)
                b = b+1
            a = a+1
            b = 0
        for node_search in tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']"):
            for element in node_search.iter(tag="attribute[@name='crop.bottom']"):
                node_products = tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']")[b]
                node_type = tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']/attribute[@name='crop.bottom']")[b]
                node_products.insert(a, node_type)
                b = b+1
            a = a+1
            b = 0
        for node_search in tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']"):
            for element in node_search.iter(tag="attribute[@name='crop.left']"):
                node_products = tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']")[b]
                node_type = tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']/attribute[@name='crop.left']")[b]
                node_products.insert(a, node_type)
                b = b+1
            a = a+1
            b = 0
        for node_search in tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']"):
            for element in node_search.iter(tag="attribute[@name='crop.right']"):
                node_products = tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']")[b]
                node_type = tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']/attribute[@name='crop.right']")[b]
                node_products.insert(a, node_type)
                b = b+1
            a = a+1
            b = 0
        for node_search in tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']"):
            for element in node_search.iter(tag="attribute[@name='image.burned_forced_narrative.locale']"):
                node_products = tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']")[b]
                node_type = tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']/attribute[@name='image.burned_forced_narrative.locale']")[b]
                node_products.insert(a, node_type)
                b = b+1
            a = a+1
            b = 0
        for node_search in tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']"):
            for element in node_search.iter(tag="attribute[@name='image.burned_subtitles.locale']"):
                node_products = tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']")[b]
                node_type = tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']/attribute[@name='image.burned_subtitles.locale']")[b]
                node_products.insert(a, node_type)
                b = b+1
            a = a+1
            b = 0
        for node_search in tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']"):
            for element in node_search.iter(tag="attribute[@name='image.textless_master']"):
                node_products = tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']")[b]
                node_type = tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']/attribute[@name='image.textless_master']")[b]
                node_products.insert(a, node_type)
                b = b+1
            a = a+1
            b = 0

Your requirements aren't entirely clear to me. The following code sorts the children of each <asset type="preview"> in this order:

unknown tags
<territories>
unknown <data_file> roles
<data_file role=source>
<data_file role=notes>

and sorts the children of each data_file like this:

unknown tags
<locale>
<file_name>
<size>
<checksum>
unknown attributes
<attribute name="crop.top">
other <attributes>, in a specific order.

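The key to understanding this technique is to realize that a node behaves like a list of its children, so it can be re-ordered the same way as any other list. In my case, I used sorted() with a custom key.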
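The list-like behaviour can be seen in isolation in a small sketch. The names `WANTED` and `by_wanted_order` and the inline XML are made up for illustration, and the fallback to the standard library's ElementTree (which offers the same list-style interface) is only an assumption for environments where lxml is not installed.

```python
try:
    from lxml import etree
except ImportError:
    # Assumption: the standard-library ElementTree supports the same
    # list-style operations used below.
    import xml.etree.ElementTree as etree

# Hypothetical target order for the children of one <data_file>.
WANTED = ["locale", "file_name", "size", "checksum"]

def by_wanted_order(child):
    # Tags missing from WANTED get -1, so they sort before the known ones.
    return WANTED.index(child.tag) if child.tag in WANTED else -1

data_file = etree.fromstring(
    "<data_file role='notes'>"
    "<size>9642</size>"
    "<file_name>some_name_nor-preview-notes.pdf</file_name>"
    "<checksum type='md5'>4d0dc3534cd1d0f9885afbfda9be8b71</checksum>"
    "<locale name='es'/>"
    "</data_file>"
)

# Slice assignment replaces the children with the same elements, re-ordered.
data_file[:] = sorted(data_file, key=by_wanted_order)

tags = [child.tag for child in data_file]
print(tags)
```

Because the same element objects are moved rather than copied, their text and attributes travel with them.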

Here you go:

from lxml import etree

def preview_key(et):
    major_ordering = ['territories', 'data_file']
    minor_ordering = ['source', 'notes']
    try:
        major = major_ordering.index(et.tag)
    except ValueError:
        major = -1
    try:
        minor = minor_ordering.index(et.get('role', None))
    except ValueError:
        minor = -1
    return major, minor

def data_file_key(et):
    major_ordering = ['locale', 'file_name', 'size', 'checksum', 'attribute']
    minor_ordering = [
            "crop.top",
            "crop.bottom",
            "crop.left",
            "crop.right",
            "image.burned_subtitles.locale",
            "image.textless_master"]
    try:
        major = major_ordering.index(et.tag)
    except ValueError:
        major = -1
    try:
        minor = minor_ordering.index(et.get('name', None))
    except ValueError:
        minor = -1
    return major, minor



# Passing the file name lets lxml pick up the encoding from the XML declaration.
parser = etree.XMLParser(remove_blank_text=True)
tree = etree.parse('input.xml', parser)

for preview in tree.xpath("//asset[@type='preview']"):
    preview[:] = sorted(preview, key=preview_key)

for data_file in tree.xpath("//data_file"):
    data_file[:] = sorted(data_file, key=data_file_key)

# etree.tostring() returns bytes, which a text-mode file rejects on Python 3,
# so let lxml write the file itself.
tree.write('output.xml', pretty_print=True)
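As a side note on the code in the question: one likely reason its attribute loops had no effect is that iter(tag=...) compares its argument against tag names only and does not evaluate predicates such as [@name='crop.top']. The sketch below, with made-up inline XML, contrasts that call with findall(), which does understand this predicate form. As before, the fallback import is an assumption for environments without lxml.

```python
try:
    from lxml import etree
except ImportError:
    # Assumption: the standard-library ElementTree behaves the same way
    # for the two calls compared below.
    import xml.etree.ElementTree as etree

data_file = etree.fromstring(
    "<data_file role='source'>"
    "<attribute name='crop.right'>4</attribute>"
    "<attribute name='crop.top'>25</attribute>"
    "</data_file>"
)

# iter() treats the whole string as a literal tag name, so nothing matches.
by_iter = list(data_file.iter(tag="attribute[@name='crop.top']"))

# findall() parses the predicate and returns the matching child.
by_path = data_file.findall("attribute[@name='crop.top']")

print(len(by_iter), len(by_path))
```

With a key function such as data_file_key above, none of these per-attribute lookups is needed, because the name attribute is read directly from each child while sorting.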

