
elementtree: get the contents of specific tags in xml document

I am trying to extract the contents of specific tags in an XML file.

Example XML:

<facts>
    <fact>
        <name>crash</name>
        <full_name>Crash</full_name>
        <variables>
            <variable>
                <name>id</name>
                <proper_name>Crash Instance</proper_name>
                <type>INT</type>
                <interpretation>key</interpretation>
            </variable>
            <variable>
                <name>accident_key</name>
                <proper_name>Case Identifier</proper_name>
                <interpretation>string</interpretation>
                <type>CHAR(9)</type>
            </variable>
            <variable>
                <name>accident_year</name>
                <proper_name>Crash Year</proper_name>
                <interpretation>dim</interpretation>
                <type>INT</type>
            </variable>
        </variables>
    </fact>
    <fact>
        <name>vehicle</name>
        <full_name>Vehicle</full_name>
        <variables>
            <variable>
                <name>id</name>
                <proper_name>Vehicle Instance</proper_name>
                <type>INT</type>
            </variable>
            <variable>
                <name>crash_id</name>
                <proper_name>Crash Instance</proper_name>
                <type>INT</type>
            </variable>
        </variables>
    </fact>
</facts>

I want to extract the contents of every <name> tag inside the <variable> nodes, but only for the Crash fact.

Here is my code so far.

import xml.etree.ElementTree as ET

def header(filename, fact):
    lst = []
    tree = ET.parse(filename) #read in the XML
    for fact in tree.iter(tag = 'fact'):
        factname = fact.find('name').text
        if factname == fact: #choose the fact to pull from
            for var in fact.iter(tag = 'variable'):
                name = var.find('name').text
                lst.append(name)
    return lst #return a list of all the <name> tags from the Crash fact

newlst = header('schema.xml','crash')

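My output newlst should be a list of the contents of all the <name> tags in the Crash fact, but it always comes back empty.

Oddly, if I hard-code everything (and remove the function), it returns the correct output: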

lst = []
tree = ET.parse('schema.xml')
for fact in tree.iter(tag = 'fact'):
    factname = fact.find('name').text
    if factname == 'crash':
        for var in fact.iter(tag = 'variable'):
            name = var.find('name').text
            lst.append(name)
print(lst)

Output: ['id', 'accident_key', 'accident_year']

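In the function, you use the name fact both as a parameter and as the loop variable of the first for loop. The loop rebinds fact to an Element, so factname == fact compares a string with an Element, which is never true, and the list stays empty. Try this version: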

import xml.etree.ElementTree as ET

def header(filename, target_factname):
    lst = []
    tree = ET.parse(filename) #read in the XML
    for fact in tree.iter(tag = 'fact'):
        factname = fact.find('name').text
        if factname == target_factname: #choose the fact to pull from
            for var in fact.iter(tag = 'variable'):
                name = var.find('name').text
                lst.append(name)
    return lst #return a list of all the <name> tags from the chosen fact
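
As an alternative sketch (not part of the original answer), the same lookup can probably be written with ElementTree's limited XPath support, where the predicate [name='crash'] selects a <fact> whose <name> child has that text. The sample data below is a trimmed copy of the XML from the question, parsed from a string so the snippet runs without schema.xml; the function name variable_names is hypothetical, and the path assumes the fact name contains no quote characters.

```python
import xml.etree.ElementTree as ET

# Trimmed-down copy of the sample XML from the question (assumption:
# only the <name> elements matter for this lookup).
SAMPLE = """
<facts>
    <fact>
        <name>crash</name>
        <variables>
            <variable><name>id</name></variable>
            <variable><name>accident_key</name></variable>
            <variable><name>accident_year</name></variable>
        </variables>
    </fact>
    <fact>
        <name>vehicle</name>
        <variables>
            <variable><name>id</name></variable>
            <variable><name>crash_id</name></variable>
        </variables>
    </fact>
</facts>
"""

def variable_names(root, target_factname):
    """Return the <name> text of every <variable> in the matching <fact>."""
    # Hypothetical helper: builds a path relative to the <facts> root.
    path = f"fact[name='{target_factname}']/variables/variable/name"
    return [el.text for el in root.findall(path)]

root = ET.fromstring(SAMPLE)
crash_names = variable_names(root, "crash")
vehicle_names = variable_names(root, "vehicle")

# The original bug in miniature: a string never equals an Element.
shadowed_comparison = ("crash" == root.find("fact"))

print(crash_names)
print(vehicle_names)
print(shadowed_comparison)
```

Because the path spells out variables/variable/name, it only picks up the <name> of each variable and not the <name> of the fact itself.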

