How do I replace an element in lxml with a string

I'm trying to figure out how to replace an element with a string in lxml and Python.

While experimenting, I came up with the following code:

from lxml import etree as et

docstring = '<p>The value is permitted only when that includes <xref linkend=\"my linkend\" browsertext=\"something here\" filename=\"A_link.fm\"/>, otherwise the value is reserved.</p>'

topicroot = et.XML(docstring)
topicroot2 = et.ElementTree(topicroot) 
xref = topicroot2.xpath('//*/xref')
xref_attribute = xref[0].attrib['browsertext']

print(xref_attribute)

The result is: "something here"

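This is the browsertext attribute I'm looking for in this small sample. What I can't seem to figure out is how to replace the entire element with the attribute text I've captured here.

(I do realize that my real input may contain multiple xref elements, so I will need a loop to go through them properly.)

What's the best way to go about doing this?

For those wondering, I have to do this because the link actually points to a file that doesn't exist, owing to differences between our build systems.

Thanks in advance!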

Try this (Python 3):

from lxml import etree as et

docstring = '<p>The value is permitted only when that includes <xref linkend=\"my linkend\" browsertext=\"something here\" filename=\"A_link.fm\"/>, otherwise the value is reserved.</p>'

# Get the root element.
topicroot = et.XML(docstring)
topicroot2 = et.ElementTree(topicroot)

# Get the text of the root element. This is a list of strings!
topicroot2_text = topicroot2.xpath("text()")

# Get the xref element.
xref = topicroot2.xpath('//*/xref')[0]
xref_attribute = xref.attrib['browsertext']

# Save a reference to the p element, then remove the xref from it.
# Note: lxml's remove() also drops the xref's tail text, hence the rebuild below.
parent = xref.getparent()
parent.remove(xref)

# Set the text of the p element by combining the list of strings with the
# extracted attribute value.
new_text = [topicroot2_text[0], xref_attribute, topicroot2_text[1]]
parent.text = "".join(new_text)

print(et.tostring(topicroot2))

Output:

b'<p>The value is permitted only when that includes something here, otherwise the value is reserved.</p>'
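
The snippet above relies on the p element containing exactly one xref and no other children. For the multiple-xref case mentioned in the question, a more general approach is sketched below. The helper name replace_with_text and the sample document are hypothetical, made up for illustration. The sketch assumes each xref should be replaced by its browsertext attribute, and it copies the element's tail text explicitly because lxml's remove() discards the tail together with the element.

```python
from lxml import etree as et


def replace_with_text(element, text):
    """Replace `element` in its parent with `text`, keeping the element's tail.

    Hypothetical helper for illustration; not part of lxml.
    """
    parent = element.getparent()
    if parent is None:
        raise ValueError("cannot replace the root element")
    # The text that takes the element's place: the replacement plus its tail.
    new_text = text + (element.tail or "")
    previous = element.getprevious()
    if previous is not None:
        # There is a preceding sibling: append to that sibling's tail.
        previous.tail = (previous.tail or "") + new_text
    else:
        # No preceding sibling: append to the parent's leading text.
        parent.text = (parent.text or "") + new_text
    # remove() drops the element together with its tail text,
    # which is why the tail was copied above.
    parent.remove(element)


# Made-up sample with two xref elements and another child element.
docstring = (
    '<p>See <xref browsertext="first link" filename="A.fm"/> and '
    '<xref browsertext="second link" filename="B.fm"/>, then <b>stop</b>.</p>'
)
root = et.XML(docstring)

# xpath() returns a plain list, so removing elements inside the loop is safe.
for xref in root.xpath("//xref"):
    replace_with_text(xref, xref.get("browsertext", ""))

result = et.tostring(root, encoding="unicode")
print(result)
# <p>See first link and second link, then <b>stop</b>.</p>
```

Appending to the previous sibling's tail (or to the parent's text when there is no previous sibling) keeps the surrounding text in document order, so other children such as the b element are left untouched.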
