Change specific repeating element in .xml using Python
I have the following .xml file that I would like to manipulate:
<html>
<A>
<B>
<C>
<D>
<TYPE>
<NUMBER>7297</NUMBER>
<DATA />
</TYPE>
<TYPE>
<NUMBER>7721</NUMBER>
<DATA>A=1,B=2,C=3,</DATA>
</TYPE>
</D>
</C>
</B>
</A>
</html>
I want to change the text inside the <DATA> element that belongs to the same <TYPE> as <NUMBER>7721</NUMBER>. How do I do that? If I use find() or findtext(), it only points to the first match.
XPath is a good fit for this kind of thing. //TYPE[NUMBER='7721' and DATA] finds all TYPE nodes that have at least one NUMBER child with the text '7721' and at least one DATA child:
from lxml import etree
xmlstr = """<html>
<A>
<B>
<C>
<D>
<TYPE>
<NUMBER>7297</NUMBER>
<DATA />
</TYPE>
<TYPE>
<NUMBER>7721</NUMBER>
<DATA>A=1,B=2,C=3,</DATA>
</TYPE>
</D>
</C>
</B>
</A>
</html>"""
html_element = etree.fromstring(xmlstr)
# find all the TYPE nodes that have NUMBER=7721 and DATA nodes
type_nodes = html_element.xpath("//TYPE[NUMBER='7721' and DATA]")
# the for loop is probably superfluous, but who knows, there might be more than one!
for t in type_nodes:
    d = t.find('DATA')
    # example: append spamandeggs to the end of the data text
    if d.text is None:
        d.text = 'spamandeggs'
    else:
        d.text += 'spamandeggs'
# tostring() returns bytes on Python 3, so decode before printing
print(etree.tostring(html_element).decode())
Output:
<html>
<A>
<B>
<C>
<D>
<TYPE>
<NUMBER>7297</NUMBER>
<DATA/>
</TYPE>
<TYPE>
<NUMBER>7721</NUMBER>
<DATA>A=1,B=2,C=3,spamandeggs</DATA>
</TYPE>
</D>
</C>
</B>
</A>
</html>
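If lxml is not available, the same idea can probably be sketched with the standard library's xml.etree.ElementTree, whose limited XPath subset supports the [tag='text'] predicate. The helper name append_to_data and the condensed XML string are assumptions made for illustration, and the sketch assumes the number contains no quote characters:

```python
import xml.etree.ElementTree as ET

# Condensed copy of the document from the question (whitespace trimmed).
xmlstr = """<html><A><B><C><D>
<TYPE><NUMBER>7297</NUMBER><DATA /></TYPE>
<TYPE><NUMBER>7721</NUMBER><DATA>A=1,B=2,C=3,</DATA></TYPE>
</D></C></B></A></html>"""

root = ET.fromstring(xmlstr)


def append_to_data(root, number, suffix):
    """Hypothetical helper: append suffix to the DATA text of every TYPE
    whose NUMBER child has exactly the given text. Returns the number of
    DATA elements changed."""
    changed = 0
    # [NUMBER='...'] filters TYPE elements by the text of a NUMBER child;
    # /DATA then selects the DATA children of those TYPE elements.
    path = ".//TYPE[NUMBER='{}']/DATA".format(number)
    for data in root.findall(path):
        # An empty element such as <DATA /> has text None.
        data.text = (data.text or "") + suffix
        changed += 1
    return changed


count = append_to_data(root, "7721", "spamandeggs")
print(ET.tostring(root, encoding="unicode"))
```

ElementTree does not accept the "and" operator used in the lxml expression, which is why the DATA step is written as a separate path segment here.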