
Beautifulsoup: Soupy runny xml, single loop iterate through each item

Say you have some XML structured like this, though it could take any shape: the same tag names may reappear at deeper levels and be reused in odd ways:

<a>
    <b>
        <c />
    </b>
    <b>
        <c />
    </b>
    <b>
        <b>
            <d>
                <b>
                    <e>
                        <f>
                            <c />
                        </f>
                    </e>
                </b>
                <b>
                    <e>
                        <f>
                            <c />
                        </f>
                    </e>
                </b>
            </d>
        </b>
    </b>
    <b>
        <b>
            <c />
        </b>
    </b>
</a>

I want to go through each of the tags one by one, in the order they appear from top to bottom; the repeated tags can be used in any order or structure. I want to do this using BeautifulSoup. For example:

from bs4 import BeautifulSoup

soup = BeautifulSoup(xmlcode, "xml")  # xmlcode holds the XML shown above
for asd in soup.find_all(True, recursive=False):
    print(asd.prettify())
    print("---------")

All this returns is a single large bs4.element.Tag. I want it to return 19 results instead, one for each tag, in the order they appear. Basically, all I want is to go over every single tag, ideally using a single loop or as few loops as possible. I'm open to better options than BeautifulSoup if there are any.

You are looking for .children, which iterates over the direct children of a node:

from bs4 import BeautifulSoup

xml_soup = BeautifulSoup(xml, "xml")
for tag in xml_soup.children:
    print(tag)
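
Note that .children only yields the immediate children of a node, so on the top-level soup object it produces just the root <a> tag. To visit every tag in document order with BeautifulSoup, calling find_all(True) without recursive=False, or iterating over .descendants and skipping text nodes, should work. Since the question is open to alternatives, here is a minimal sketch using the standard library's xml.etree.ElementTree instead; the xml_text sample is a shortened, assumed stand-in for the XML in the question, and tag_names is an illustrative variable name.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for the XML from the question (assumed sample data).
xml_text = """
<a>
    <b>
        <c />
    </b>
    <b>
        <b>
            <d>
                <e />
            </d>
        </b>
    </b>
</a>
"""

root = ET.fromstring(xml_text)

# Element.iter() yields the root and then every descendant element
# depth-first, which matches top-to-bottom document order.
tag_names = [element.tag for element in root.iter()]

# A single loop over every tag, however deeply it is nested.
for name in tag_names:
    print(name)
    print("---------")
```

Run against the full XML from the question, the same loop should visit all 19 tags, because iter() does not care how deep a tag sits or how often a name is reused.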


 