How to retrieve tags from XML using Python

Question

Here is my XML document:

<arm_group>
  <arm_group_label>Phase I, Part A </arm_group_label>
  <arm_group_type>Experimental</arm_group_type>
  <description>Dose escalation </description>
</arm_group>
<arm_group>
  <arm_group_label>Phase I, Part B </arm_group_label>
  <arm_group_type>Experimental</arm_group_type>
  <description>Dose escalation and safety </description>
</arm_group>
<arm_group>
  <arm_group_label>Phase IIa - Part A Expansion </arm_group_label>
  <arm_group_type>Experimental</arm_group_type>
  <description>Repeated administrations </description>
</arm_group>

Here is my code:

import os
import xml.etree.ElementTree as ET

ids = []
contents = []
for file in os.listdir('xml/'):
    if '.xml' in file:
        tree = ET.parse(f'xml/{file}')
        root = tree.getroot()
        # Collect each field separately across all arm_group elements
        armGrpLabel = []
        for x in root.findall('arm_group/arm_group_label'):
            armGrpLabel.append(x.text)
        armGrpType = []
        for x in root.findall('arm_group/arm_group_type'):
            armGrpType.append(x.text)
        armDesc = []
        for x in root.findall('arm_group/description'):
            armDesc.append(x.text)
        # Join each list and strip tabs, carriage returns, and doubled newlines
        armGrpLabel = '\n'.join(armGrpLabel).replace('\t','').replace('\n\n','\n').replace('\r','')
        armGrpType = '\n'.join(armGrpType).replace('\t','').replace('\n\n','\n').replace('\r','')
        armDesc = '\n'.join(armDesc).replace('\t','').replace('\n\n','\n').replace('\r','')
        text = (armGrpLabel) + '\n\n' + (armGrpType) + '\n\n' + (armDesc)
        contents.append(text)
        ids.append(file[:-4])

I get the following output:

Phase I, Part A

Experimental

Dose escalation

Dose escalation and safety

Repeated administrations

However, I want the output to look like this:

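Phase I, Part A

Experimental

Dose escalation

Phase I, Part B

Experimental

Dose escalation and safety

Phase IIa - Part A Expansion

Experimental

Repeated administrations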

Answer 1

Here is how to do it with xmlstarlet. You just need to do something similar in Python.

xmlstarlet sel --template --match '/*/*/*' --value-of 'text()' --nl input.xml
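
One possible Python equivalent is sketched below. Like the xmlstarlet command, it walks each arm_group in document order and emits the text of its children, so the values stay grouped per arm instead of being collected per field. The `<study>` root element, the `SAMPLE` string, and the helper name `arm_groups_to_text` are assumptions made for illustration. The question's fragment has no single root, which ElementTree requires, so a real file would presumably supply its own.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: part of the question's fragment wrapped in an
# assumed <study> root, because ElementTree needs one root element.
SAMPLE = """<study>
<arm_group>
  <arm_group_label>Phase I, Part A </arm_group_label>
  <arm_group_type>Experimental</arm_group_type>
  <description>Dose escalation </description>
</arm_group>
<arm_group>
  <arm_group_label>Phase I, Part B </arm_group_label>
  <arm_group_type>Experimental</arm_group_type>
  <description>Dose escalation and safety </description>
</arm_group>
</study>"""


def arm_groups_to_text(root):
    """Return the text of every arm_group child, in document order,
    with a blank line between consecutive values."""
    values = []
    for group in root.findall('arm_group'):
        # Iterating an Element yields its direct children in order.
        for child in group:
            if child.text and child.text.strip():
                values.append(child.text.strip())
    return '\n\n'.join(values)


root = ET.fromstring(SAMPLE)
text = arm_groups_to_text(root)
print(text)
```

In the question's loop, `text = arm_groups_to_text(root)` could stand in for the three separate `findall` passes and the chained `replace` calls, since `strip()` already removes the trailing spaces around each value.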
