How do I replace an element in lxml with a string
I'm trying to figure out how to replace an element with a string using lxml and Python.
As an experiment, I have the following code:
from lxml import etree as et
docstring = '<p>The value is permitted only when that includes <xref linkend=\"my linkend\" browsertext=\"something here\" filename=\"A_link.fm\"/>, otherwise the value is reserved.</p>'
topicroot = et.XML(docstring)
topicroot2 = et.ElementTree(topicroot)
xref = topicroot2.xpath('//*/xref')
xref_attribute = xref[0].attrib['browsertext']
print(xref_attribute)
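The result is: "something here"
That is the browsertext attribute I'm looking for in this small sample. What I can't seem to figure out is how to replace the entire element with the attribute text I've captured here.
(I do realize that my sample could contain multiple xrefs, so I would need to construct a loop to go through them properly.)
What is the best way to go about doing this?
For those wondering, I have to do this because the link actually points to a file that doesn't exist due to our different build systems.
Thanks in advance!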
Try this (Python 3):
from lxml import etree as et
docstring = '<p>The value is permitted only when that includes <xref linkend=\"my linkend\" browsertext=\"something here\" filename=\"A_link.fm\"/>, otherwise the value is reserved.</p>'
# Get the root element.
topicroot = et.XML(docstring)
topicroot2 = et.ElementTree(topicroot)
# Get the text of the root element. This is a list of strings!
topicroot2_text = topicroot2.xpath("text()")
# Get the xref element.
xref = topicroot2.xpath('//*/xref')[0]
xref_attribute = xref.attrib['browsertext']
# Save a reference to the p element, remove the xref from it.
parent = xref.getparent()
parent.remove(xref)
# Set the text of the p element by combining the list of string with the
# extracted attribute value.
new_text = [topicroot2_text[0], xref_attribute, topicroot2_text[1]]
parent.text = "".join(new_text)
print(et.tostring(topicroot2))
Output:
b'<p>The value is permitted only when that includes something here, otherwise the value is reserved.</p>'
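The code above handles a single xref whose parent has no other child elements. For the multiple-xref case mentioned in the question, the following is a minimal sketch rather than a definitive implementation. The helper name replace_with_attribute and its parameters are made up for illustration. It relies on lxml storing the text that follows an element in that element's .tail, and on remove() dropping that tail along with the element, so the tail is copied out first.

```python
from lxml import etree as et


def replace_with_attribute(root, tag="xref", attr="browsertext"):
    """Replace every <tag> element under root with the value of its attr.

    Hypothetical helper for illustration; modifies the tree in place.
    """
    # Build the list first so that removing nodes does not disturb iteration.
    for elem in list(root.iter(tag)):
        parent = elem.getparent()
        if parent is None:
            continue  # the root element itself cannot be replaced by text
        # Text that takes the element's place: attribute value plus its tail.
        replacement = elem.get(attr, "") + (elem.tail or "")
        previous = elem.getprevious()
        if previous is not None:
            # Preceded by a sibling: the text belongs in that sibling's tail.
            previous.tail = (previous.tail or "") + replacement
        else:
            # First child: the text belongs in the parent's leading text.
            parent.text = (parent.text or "") + replacement
        parent.remove(elem)  # in lxml this also discards elem.tail
    return root


docstring = ('<p>The value is permitted only when that includes '
             '<xref linkend="my linkend" browsertext="something here" '
             'filename="A_link.fm"/>, otherwise the value is reserved.</p>')
root = replace_with_attribute(et.XML(docstring))
print(et.tostring(root, encoding="unicode"))
# <p>The value is permitted only when that includes something here, otherwise the value is reserved.</p>
```

Appending to the previous sibling's tail (or to the parent's text when there is no previous sibling) keeps the surrounding text in order even when the parent contains other child elements between the xrefs.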