python lxml write to file in predefined order

I want to write the following lxml etree subelements

<ElementProtocolat0x3803048>,
<ElementStudyEventDefat0x3803108>,
<ElementFormDefat0x3803248>,
<ElementItemGroupDefat0x38032c8>,
<ElementClinicalDataat0x3803408>,
<ElementItemGroupDataat0x38035c8>,
<ElementFormDefat0x38036c8>,

to my ODM XML file in a predefined order:

<ElementProtocolat0x3803048>,
<ElementStudyEventDefat0x3803108>,
<ElementFormDefat0x3803248>,
<ElementFormDefat0x38036c8>,
<ElementItemGroupDefat0x38032c8>,
<ElementItemGroupDataat0x38035c8>,
<ElementClinicalDataat0x3803408>,
....

Is there a way to sort the elements against a predefined list, such as this one?

predefined_order = ['Protocol', 'StudyEventDef','FormDef','ItemGroupDef','ItemDef','CodeList']

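This example demonstrates:

  • how to read an XML file,
  • that an element behaves like a list of its children and can be manipulated as one,
  • how to sort a list according to a predefined order by matching a substring,
  • how to write out the XML file.
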
from lxml import etree
import re

# Parse the XML and find the root
with open('input.xml') as input_file:
    tree = etree.parse(input_file)
root = tree.getroot()

# Find the list to sort and sort it
some_arbitrary_expression_to_find_the_list = '.'
element_list = tree.xpath(some_arbitrary_expression_to_find_the_list)[0]

predefined_order = [
    'Protocol',
    'StudyEventDef',
    'FormDef',
    'ItemGroupDef',
    'ItemGroupData',
    'ItemDef',
    'CodeList',
    'ClinicalData']
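# Capture the name between "Element" and "at0x" in each tag.
# Named tag_pattern so it does not shadow the built-in filter().
tag_pattern = re.compile(r'Element(.*)at0x.*')

element_list[:] = sorted(
    element_list[:],
    key=lambda x: predefined_order.index(tag_pattern.match(x.tag).group(1)))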

# Write the XML to the output file.
# etree.tostring() returns bytes, so the file is opened in binary mode.
with open('output.xml', 'wb') as output_file:
    output_file.write(etree.tostring(tree, pretty_print=True))

Sample input:

<stuff>
<ElementProtocolat0x3803048 />
<ElementStudyEventDefat0x3803108 />
<ElementFormDefat0x3803248 />
<ElementItemGroupDefat0x38032c8>Random Text</ElementItemGroupDefat0x38032c8>
<ElementClinicalDataat0x3803408 />
<ElementItemGroupDataat0x38035c8><tag1><tag2 attr="random tags"/></tag1></ElementItemGroupDataat0x38035c8>
<ElementFormDefat0x38036c8 />
</stuff>

Output:

<stuff>
<ElementProtocolat0x3803048/>
<ElementStudyEventDefat0x3803108/>
<ElementFormDefat0x3803248/>
<ElementFormDefat0x38036c8/>
<ElementItemGroupDefat0x38032c8>Random Text</ElementItemGroupDefat0x38032c8>
<ElementItemGroupDataat0x38035c8><tag1><tag2 attr="random tags"/></tag1></ElementItemGroupDataat0x38035c8>
<ElementClinicalDataat0x3803408/>
</stuff>
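
If the elements carry their real tag names (for example `Protocol` or `FormDef`) rather than the `Element...at0x...` strings used above, the same slice-assignment idea can be sketched without a regular expression. This is only a sketch under assumptions: the `MetaDataVersion` wrapper, the `OID` attributes, and the helper names `local_name` and `sort_children` are made up for illustration. Tags missing from the list are sent to the end instead of raising `ValueError`, and a namespace prefix such as `{...}` is ignored.

```python
try:
    from lxml import etree
except ImportError:
    # The standard library module offers the same calls used below.
    import xml.etree.ElementTree as etree

PREDEFINED_ORDER = ['Protocol', 'StudyEventDef', 'FormDef',
                    'ItemGroupDef', 'ItemDef', 'CodeList']
# Map each tag name to its position for constant-time lookups.
RANK = {name: position for position, name in enumerate(PREDEFINED_ORDER)}


def local_name(element):
    """Return the tag without a '{namespace}' prefix ('' for comments etc.)."""
    tag = element.tag
    if not isinstance(tag, str):
        return ''
    return tag.rsplit('}', 1)[-1]


def sort_children(parent, rank=RANK):
    """Reorder the children of parent in place; unknown tags go last."""
    parent[:] = sorted(
        parent, key=lambda child: rank.get(local_name(child), len(rank)))


# Hypothetical sample document, not taken from a real ODM file.
xml = ('<MetaDataVersion>'
       '<FormDef OID="F1"/><ItemGroupDef OID="G1"/><Protocol/>'
       '<Unknown/><FormDef OID="F2"/><StudyEventDef/>'
       '</MetaDataVersion>')
root = etree.fromstring(xml)
sort_children(root)
print([child.tag for child in root])
# ['Protocol', 'StudyEventDef', 'FormDef', 'FormDef', 'ItemGroupDef', 'Unknown']
```

Because `sorted` is stable, elements that share a tag (the two `FormDef` entries here) keep their original relative order.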

Sorry, I don't know much about XML, but I tried to sort the data using only basic Python.

import re
data = """<ElementProtocolat0x3803048>,
<ElementStudyEventDefat0x3803108>,
<ElementFormDefat0x3803248>,
<ElementItemGroupDefat0x38032c8>,
<ElementClinicalDataat0x3803408>,
<ElementItemGroupDataat0x38035c8>,
<ElementFormDefat0x38036c8>,"""

predefined_order = ['Protocol','StudyEventDef','FormDef','ItemGroupDef','ItemGroupData','CodeList', 'ClinicalData']

# The with block closes the file, so the output is flushed to disk.
with open("something.xml", "w") as fh1:
    for i in predefined_order:
        for j in data.split(','):
            if re.search(i, j):
                fh1.write(j + ',')

Output:

<ElementProtocolat0x3803048>,
<ElementStudyEventDefat0x3803108>,
<ElementFormDefat0x3803248>,
<ElementFormDefat0x38036c8>,
<ElementItemGroupDefat0x38032c8>,
<ElementItemGroupDataat0x38035c8>,
<ElementClinicalDataat0x3803408>,
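
The nested loops above rely on substring search, which would also match a name that happens to appear inside a longer one. A possible variant, sketched here with the same strings, extracts the exact name from each entry and sorts once with a key function. The names `entries`, `rank`, `pattern` and `sort_key` are illustrative, and the pattern assumes every entry looks like `<Element` + name + `at0x` + hex digits + `>`.

```python
import re

entries = [
    '<ElementProtocolat0x3803048>',
    '<ElementStudyEventDefat0x3803108>',
    '<ElementFormDefat0x3803248>',
    '<ElementItemGroupDefat0x38032c8>',
    '<ElementClinicalDataat0x3803408>',
    '<ElementItemGroupDataat0x38035c8>',
    '<ElementFormDefat0x38036c8>',
]

predefined_order = ['Protocol', 'StudyEventDef', 'FormDef', 'ItemGroupDef',
                    'ItemGroupData', 'CodeList', 'ClinicalData']
rank = {name: position for position, name in enumerate(predefined_order)}

# Lazy group: the shortest name that is followed by "at0x<hex digits>>".
pattern = re.compile(r'<Element(\w+?)at0x[0-9a-fA-F]+>')


def sort_key(entry):
    """Position of the entry's name in predefined_order; unknown names last."""
    match = pattern.fullmatch(entry)
    name = match.group(1) if match else None
    return rank.get(name, len(rank))


ordered = sorted(entries, key=sort_key)
print(',\n'.join(ordered))
```

Entries with the same name keep their original relative order, because `sorted` is stable.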

