Search text and replace with lxml
I need to search for text in a multi-line XML file that contains multiple tags. My XML file looks like this:
<?xml version="1.0" encoding="utf-8"?>
<nc:data xmlns:nc="urn:ietf:params:xml:ns:netconf:base:1.0">
<system xmlns="http://www.abc.xyz">
<context>
<name>context_1</name>
<host>
<name>xyz</name>
<tag1>
<name>pqr</name>
<role>s1</role>
<tag2>test</tag2>
</tag1>
<tag2>
<name>pqr</name>
<role>s1</role>
<tag2>test</tag2>
</tag2>
</host>
</context>
</system>
</nc:data>
I want to find every occurrence of the text "test" in the XML file and list the parent tag of each match in the output. Unfortunately, I have been unable to do so.
The Python code that I have written is:
import os
import xml
import sys
from xml.dom import minidom
import xml.etree.ElementTree as ET
def xml_parsing():
    # Parse the file, then look for <context> elements anywhere in the tree
    with open('file.xml', 'rt') as f:
        tree = ET.parse(f)
    for node in tree.findall('.//context'):
        print(node, node.tag, node.attrib)
        url = node.attrib.get('tag1')
        print(url)

xml_parsing()
I get a blank result as output and cannot get any further. I have tried both ElementTree and lxml. I believe the problem has something to do with the search pattern that I pass to findall.
Please advise what I should try next.
I also tried minidom, and the code looks like this:
xmldoc = minidom.parse('file.xml')
reflist = xmldoc.getElementsByTagName('tag1')
print(reflist[0].toxml())
But this returns the complete element markup rather than just the value between the tags.
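A likely cause of the blank findall result is the default namespace: xmlns="http://www.abc.xyz" on <system> puts every nested element, including context, in that namespace, so the unqualified path .//context matches nothing. The sketch below uses the standard-library ElementTree on a trimmed copy of the XML; the prefix a and the variable names are illustrative choices, not part of the original code.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's document; the default namespace declared on
# <system> applies to every element nested inside it.
XML = b"""<?xml version="1.0" encoding="utf-8"?>
<nc:data xmlns:nc="urn:ietf:params:xml:ns:netconf:base:1.0">
  <system xmlns="http://www.abc.xyz">
    <context>
      <name>context_1</name>
    </context>
  </system>
</nc:data>"""

root = ET.fromstring(XML)

# Unqualified name: the element's real tag is "{http://www.abc.xyz}context",
# so a search for plain "context" finds nothing.
unqualified = root.findall(".//context")

# Qualified name through a prefix map; "a" is an arbitrary local alias.
ns = {"a": "http://www.abc.xyz"}
qualified = root.findall(".//a:context", ns)

print(len(unqualified))                      # 0
print(len(qualified))                        # 1
print(qualified[0].tag)                      # {http://www.abc.xyz}context
print(qualified[0].find("a:name", ns).text)  # context_1
```

Note also that tag1 is a child element, not an attribute, so node.attrib.get('tag1') would return None even once the context element is found.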
The XPath expression that finds elements whose text value equals test, regardless of element name and location in the XML document, is //*[text()='test'] or alternatively //*[.='test'].
Consider the following working lxml example, which demonstrates finding such elements and updating their value:
from lxml import etree as ET
xml = '''<?xml version="1.0" encoding="utf-8"?>
<nc:data xmlns:nc="urn:ietf:params:xml:ns:netconf:base:1.0">
<system xmlns="http://www.abc.xyz">
<context>
<name>context_1</name>
<host>
<name>xyz</name>
<tag1>
<name>pqr</name>
<role>s1</role>
<tag2>test</tag2>
</tag1>
<tag2>
<name>pqr</name>
<role>s1</role>
<tag2>test</tag2>
</tag2>
</host>
</context>
</system>
</nc:data>'''
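# lxml rejects str input that carries an encoding declaration, so pass bytes
tree = ET.fromstring(xml.encode('utf-8'))
for node in tree.xpath("//*[.='test']"):
    # update node value with new text 'foo'
    node.text = 'foo'
    print(ET.tostring(node, encoding='unicode', with_tail=False))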
Output:
<tag2 xmlns="http://www.abc.xyz" xmlns:nc="urn:ietf:params:xml:ns:netconf:base:1.0">foo</tag2>
<tag2 xmlns="http://www.abc.xyz" xmlns:nc="urn:ietf:params:xml:ns:netconf:base:1.0">foo</tag2>
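The question also asks for the parent tag of each match. A minimal sketch of one way to list them follows, assuming the standard-library ElementTree rather than lxml; the names parent_of and local_name are hypothetical helpers introduced here for illustration. ElementTree elements keep no reference to their parent, so the sketch builds a child-to-parent map first.

```python
import xml.etree.ElementTree as ET

XML = b"""<?xml version="1.0" encoding="utf-8"?>
<nc:data xmlns:nc="urn:ietf:params:xml:ns:netconf:base:1.0">
<system xmlns="http://www.abc.xyz">
<context>
<name>context_1</name>
<host>
<name>xyz</name>
<tag1>
<name>pqr</name>
<role>s1</role>
<tag2>test</tag2>
</tag1>
<tag2>
<name>pqr</name>
<role>s1</role>
<tag2>test</tag2>
</tag2>
</host>
</context>
</system>
</nc:data>"""

root = ET.fromstring(XML)

# Build a child -> parent lookup, since ElementTree has no parent pointer.
parent_of = {child: parent for parent in root.iter() for child in parent}


def local_name(tag):
    """Strip the '{namespace}' part that ElementTree prepends to tag names."""
    return tag.rsplit("}", 1)[-1]


# Compare each element's own text, so container elements are not matched.
matches = [el for el in root.iter() if el.text == "test"]
parent_tags = [local_name(parent_of[el].tag) for el in matches]

for el, parent in zip(matches, parent_tags):
    print(local_name(el.tag), "is inside", parent)
# tag2 is inside tag1
# tag2 is inside tag2
```

With lxml the map is unnecessary, because each element returned by xpath exposes its parent through node.getparent().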