Control search depth of findall in lxml

I am a beginner in Python and am trying to parse XML using lxml. I am searching for a tag using findall(), but I want to control the depth of the search so that it does not go beyond one level. An example is below:

<?xml version='1.0' encoding='utf-8'?>
<system xmlns="some_name_space">
<a>
    <host>Random Name</host>
    <class>
        <name>Main_Tag_1</name>
        <detail>
            <name>Child_Tag_1</name>
            <ip>ip_1</ip>
            <port>port_1</port>
        </detail>
    </class>
    <class>
        <name>Main_Tag_2</name>
        <detail>
            <name>Child_Tag_2</name>
            <ip>ip_2</ip>
            <port>port_2</port>
        </detail>
    </class>
    <class>
        <name>Main_Tag_3</name>
        <detail>
            <name>Child_Tag_3</name>
            <ip>ip_3</ip>
            <port>port_3</port>
        </detail>
    </class>
</a>
</system>

I am using the following Python to find all the main tags, which share the tag name <name> with their child tags. I haven't added the complete program here, but this function is a method of a class.

# Assumes lxml is imported as: from lxml import etree as ET
def name_ip_dict(self, filename):
    self.tag_replace = {}
    context = ET.iterparse(filename, tag='{some_name_space}class')
    for action, elem in context:
        # ".//" matches <name> at any depth below <class>
        name_tag = elem.findall(".//{some_name_space}name")
        for name in name_tag:
            print(name.text)
            for node in elem:
                ip_list = node.findall(".//{some_name_space}ip")
                for ip in ip_list:
                    self.tag_replace.setdefault(name.text, []).append(ip.text)

Right now, I am getting this output:

{'Main_Tag_1': ['ip_1'], 'Child_Tag_1': ['ip_1'], 'Main_Tag_2': ['ip_2'], 'Child_Tag_2': ['ip_2']} and so on.

But I only want the first-level parent names, i.e. Main_Tag_1, Main_Tag_2, or Main_Tag_3, together with the text of the ip tag:

{'Main_Tag_1': ['ip_1'], 'Main_Tag_2': ['ip_2']} and so on..

This makes me think I need to control the depth of findall, but I haven't been able to find anything about depth on the web.

Please let me know if this use case has come up before and what the best way to achieve it is.

Use a single slash (/) if you only want to search the direct child elements (not grandchildren or deeper descendants):

name_tag = elem.findall("./{some_name_space}name")
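As a rough illustration of the difference, the sketch below compares "./" and ".//" on a trimmed copy of the XML from the question. To stay dependency-free it uses the standard library's xml.etree.ElementTree, which accepts the same path syntax in findall(); the same findall() calls should behave the same way on lxml elements. The function takes an XML string rather than a filename, and the NS and XML constants are assumptions made for the example, not part of the original program.

```python
import xml.etree.ElementTree as ET

NS = "{some_name_space}"  # namespace prefix in Clark notation, as in the question

# Trimmed sample document (assumption: same shape as the question's XML)
XML = """<system xmlns="some_name_space">
  <a>
    <class>
      <name>Main_Tag_1</name>
      <detail><name>Child_Tag_1</name><ip>ip_1</ip></detail>
    </class>
    <class>
      <name>Main_Tag_2</name>
      <detail><name>Child_Tag_2</name><ip>ip_2</ip></detail>
    </class>
  </a>
</system>"""


def name_ip_dict(xml_text):
    """Map each direct <name> child of a <class> to the <ip> texts below it."""
    result = {}
    root = ET.fromstring(xml_text)
    for cls in root.iter(NS + "class"):
        # "./" looks only at direct children, so Child_Tag_* is skipped
        for name in cls.findall("./" + NS + "name"):
            # ".//" still searches every level, which is what we want for <ip>
            for ip in cls.findall(".//" + NS + "ip"):
                result.setdefault(name.text, []).append(ip.text)
    return result


first_class = next(ET.fromstring(XML).iter(NS + "class"))
direct = [n.text for n in first_class.findall("./" + NS + "name")]
nested = [n.text for n in first_class.findall(".//" + NS + "name")]
mapping = name_ip_dict(XML)
```

Here direct holds only Main_Tag_1, while nested also picks up Child_Tag_1, which is the behaviour the question ran into.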

Just a heads-up: when you need support for more advanced XPath expressions, use lxml's xpath() method instead of findall(). The latter supports only a very limited subset of XPath.
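A minimal sketch of what xpath() allows, assuming lxml is installed. The sample document is a trimmed copy of the question's XML, and the prefix "ns" is an arbitrary name chosen for this example; xpath() needs a prefix-to-URI mapping rather than the {namespace}tag form used with findall().

```python
from lxml import etree

# Trimmed sample document (assumption: same shape as the question's XML)
XML = """<system xmlns="some_name_space">
  <a>
    <class>
      <name>Main_Tag_1</name>
      <detail><name>Child_Tag_1</name><ip>ip_1</ip></detail>
    </class>
    <class>
      <name>Main_Tag_2</name>
      <detail><name>Child_Tag_2</name><ip>ip_2</ip></detail>
    </class>
  </a>
</system>"""

NS = {"ns": "some_name_space"}  # "ns" is a hypothetical prefix for this sketch
root = etree.fromstring(XML)

# Direct <name> children of every <class>, returned as strings via text()
main_names = root.xpath("//ns:class/ns:name/text()", namespaces=NS)

# A predicate on a descendant: the main name of the class whose <ip> is "ip_2"
owner = root.xpath(
    "//ns:class[.//ns:ip = 'ip_2']/ns:name/text()", namespaces=NS
)
```

Both expressions use a single slash before ns:name, so only the first-level names are returned; the text() step and the descendant predicate are the kind of features that go beyond the path subset findall() understands.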
