Python/XML: number of child nodes

What I want to get from the XML file below is this: if a <term> node contains more than one <broader> node AND the value of any of those <broader> nodes equals the value of the <id> node, THEN print the text of the <value> node.

        <results>
            <term>
                <altLabel>
                    <value>Label1</value>
                </altLabel>
                <broader>11</broader>
                <id>1</id>
            </term>
            <term>
                <altLabel>
                    <value>Label2</value>
                </altLabel>
                <broader>22</broader>
                <broader>2</broader>
                <id>2</id>
            </term>
            <term>
                <altLabel>
                    <value>Label3</value>
                </altLabel>
                <broader>3</broader>
                <broader>33</broader>
                <id>3</id>
            </term>
            <term>
                <altLabel>
                    <value>Label4</value>
                </altLabel>
                <broader>44</broader>
                <broader>44</broader>
                <id>4</id>
            </term>
        </results>

So for the XML above, I expect to get:

Label2
Label3

NOTE: the number of child nodes inside a <term> node may vary. This is just a sample XML, so I'm not interested in a solution that points at one specific element.

Using BeautifulSoup, you can loop over all term tags and check whether the id text of each one is equal to any of its broader texts:

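from bs4 import BeautifulSoup

# doc is the XML document as a string
soup = BeautifulSoup(doc, 'lxml')
for term in soup.find_all('term'):
    # Text of every <broader> tag inside this <term>
    broaders = [b.text for b in term.find_all('broader')]
    # More than one <broader>, and at least one equals the <id> text.
    # The membership test prints each label at most once.
    if len(broaders) > 1 and term.find('id').text in broaders:
        print(term.value.text)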

This will print:

Label2
Label3

Using the built-in xml module, the syntax is quite similar to BeautifulSoup :)

Replace path_to_xml with the path to your XML file.

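# cElementTree was removed in Python 3.9; ElementTree is the supported module
import xml.etree.ElementTree as ET

root = ET.parse(path_to_xml).getroot()
for term in root.iter('term'):
    # Text of every <broader> child of this <term>
    broaders = [b.text for b in term.findall('broader')]
    # More than one <broader>, and at least one equals the <id> text
    if len(broaders) > 1 and term.findtext('id') in broaders:
        print(term.findtext('altLabel/value'))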
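
If you want to try this without a file on disk, here is a minimal self-contained sketch of the same check using only the standard library. The function name matching_labels and the constant SAMPLE_XML are my own illustrative names, not part of the question; the sample is a shortened copy of the XML above. It assumes each <term> has at most one <id> and that the label lives at altLabel/value.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample XML from the question (hypothetical constant name)
SAMPLE_XML = """
<results>
    <term>
        <altLabel><value>Label1</value></altLabel>
        <broader>11</broader>
        <id>1</id>
    </term>
    <term>
        <altLabel><value>Label2</value></altLabel>
        <broader>22</broader>
        <broader>2</broader>
        <id>2</id>
    </term>
    <term>
        <altLabel><value>Label3</value></altLabel>
        <broader>3</broader>
        <broader>33</broader>
        <id>3</id>
    </term>
    <term>
        <altLabel><value>Label4</value></altLabel>
        <broader>44</broader>
        <broader>44</broader>
        <id>4</id>
    </term>
</results>
"""


def matching_labels(xml_text):
    """Return labels of <term> nodes that have more than one <broader>
    child, at least one of which has the same text as the term's <id>."""
    root = ET.fromstring(xml_text)
    labels = []
    for term in root.iter("term"):
        term_id = term.findtext("id")  # None if the term has no <id>
        broaders = [b.text for b in term.findall("broader")]
        if term_id is not None and len(broaders) > 1 and term_id in broaders:
            labels.append(term.findtext("altLabel/value"))
    return labels


if __name__ == "__main__":
    for label in matching_labels(SAMPLE_XML):
        print(label)
```

Returning a list instead of printing inside the loop makes the result easy to reuse, and the membership test adds each label at most once even when several <broader> values equal the <id>.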
