How to remove a tag in lxml?

First, I followed this question, but I still have issues with the remove method.

tag.getparent().remove(tag)

I used this piece of code to remove the anchor tag in question, which has the attributes name="2" and id="2", from this webpage. After the line was executed, I could still see the tag and its properties, and when I iterated through all the children I could still see the element that I had deleted.

What exactly does the remove method do, and why does the deleted tag still persist?

This is the screenshot of the debugger after the line is executed.

When you remove a node from its parent, the node itself still exists; it is simply detached from the parent. This is why a variable that still references the node keeps showing its tag and attributes, and it allows you to append the "deleted" node to a different parent. But if you don't append the node to a new parent, then the node is as good as deleted from the perspective of the root node.
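A minimal sketch of this behaviour, assuming lxml is installed. The sample markup and the variable names are made up for illustration and are not taken from the asker's page:

```python
from lxml import etree

# Hypothetical markup standing in for the asker's page.
root = etree.fromstring('<div><a name="2" id="2">link</a><p>text</p></div>')

tag = root.find("a")
tag.getparent().remove(tag)

# The Python variable still points at the detached element,
# so its attributes remain readable (as a debugger would show).
detached_id = tag.get("id")

# The element no longer has a parent and is gone from the tree.
detached_parent = tag.getparent()
remaining = [child.tag for child in root]
```

Inspecting `tag` after the call therefore says nothing about the tree; check `root` (or re-run the search from the root) instead.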

To preserve the children of the tag being removed, you can move them up into the tag's parent at the same index, like this:

parent = tag.getparent()
index = parent.index(tag)
# Iterate in reverse so every child can be inserted at the same index
# while the original order is preserved.
for child in reversed(list(tag)):
    # insert() moves the child; it is detached from tag automatically.
    parent.insert(index, child)
# Note: tag.text and tag.tail are discarded together with the tag.
parent.remove(tag)
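As a rough illustration, the same steps can be wrapped in a helper and tried on a small tree. The function name `unwrap` and the sample markup are assumptions made for this sketch:

```python
from lxml import etree

def unwrap(tag):
    """Replace tag with its child elements (hypothetical helper)."""
    parent = tag.getparent()
    index = parent.index(tag)
    # Reverse order keeps the children in their original sequence
    # while always inserting at the same index.
    for child in reversed(list(tag)):
        parent.insert(index, child)
    parent.remove(tag)

root = etree.fromstring("<div><h1/><a><b/><i/></a><p/></div>")
unwrap(root.find("a"))
order = [child.tag for child in root]
```

Here `order` ends up listing the siblings with `b` and `i` sitting where `a` used to be. Any text directly inside the removed tag is not carried over by this approach.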

Or, if the document was parsed with lxml.html, you can simply use the drop_tag method, which also keeps the tag's text and tail:

tag.drop_tag()
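A small sketch of drop_tag on an HTML fragment, assuming the tree comes from lxml.html (plain lxml.etree elements do not provide this method). The fragment is invented for the example:

```python
from lxml import html

# Hypothetical fragment; html.fromstring returns the single <div> element.
fragment = html.fromstring(
    '<div>before <a name="2" id="2">link <b>bold</b></a> after</div>'
)

fragment.find("a").drop_tag()

# The anchor is gone, but its text and child element remain in place.
anchor_after = fragment.find("a")
bold_after = fragment.find("b")
text = fragment.text_content()
```

Unlike the manual loop, this keeps the surrounding text intact, which is usually what is wanted when stripping inline tags such as anchors.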

