Search text and replace with lxml

Question

I need to search for text in a multi-line XML file that contains multiple tags. My XML file looks like this:

<?xml version="1.0" encoding="utf-8"?>
<nc:data xmlns:nc="urn:ietf:params:xml:ns:netconf:base:1.0">
    <system xmlns="http://www.abc.xyz">
      <context>
            <name>context_1</name>
            <host>
                <name>xyz</name>
                <tag1>
                    <name>pqr</name>
                    <role>s1</role>
                    <tag2>test</tag2>
                </tag1>
                <tag2>
                    <name>pqr</name>
                    <role>s1</role>
                    <tag2>test</tag2>
                </tag2>              
            </host>
      </context>
    </system>
</nc:data>

I want to find every occurrence of the text "test" in the XML file and list the parent tag of each match in the output. Unfortunately, I have been unable to do so.

The Python code that I have written is:

import os
import xml 
import sys 
from xml.dom import minidom
import xml.etree.ElementTree as ET

def xml_parsing():
    with open('file.xml', 'rt') as f:
        tree = ET.parse(f)
        for node in tree.findall('.//context'):
            print(node, node.tag, node.attrib)
            url = node.attrib.get('tag1')
            print(url)

xml_parsing()

I get a blank result as output and cannot get any further. I have tried both ElementTree and lxml. I believe it has something to do with the search pattern that I pass to findall.

Please advise what I should try next.

I tried minidom (the DOM API) as well, and the code is like this:

from xml.dom import minidom

xmldoc = minidom.parse('file.xml')
reflist = xmldoc.getElementsByTagName('tag1')
print(reflist[0].toxml())

But this returns the complete element markup rather than just the value between the tags.

Answer 1

The XPath expression that finds an element whose text value equals test, regardless of the element's name and location in the XML document, is //*[text()='test'] or, alternatively, //*[.='test'].
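The two expressions agree on the XML in the question, but they are not strictly equivalent. The sketch below uses a small made-up document (the element names r, a, b and i are illustrative only and do not come from the question) to show where they differ: text()='test' compares each individual text node, while .='test' compares the element's concatenated string value.

```python
from lxml import etree

# Hypothetical document: <b> splits "test" across two text nodes.
doc = etree.fromstring("<r><a>test</a><b>te<i/>st</b></r>")

# Matches elements that have at least one text node equal to 'test'.
by_text_node = [el.tag for el in doc.xpath("//*[text()='test']")]

# Matches elements whose whole string value (all descendant text joined) is 'test'.
by_string_value = [el.tag for el in doc.xpath("//*[.='test']")]

print(by_text_node)
print(by_string_value)
```

Here the first list contains only a, while the second contains both a and b. For leaf elements holding plain text, as in the question, either expression should work.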

Consider the following working lxml example, which demonstrates finding such elements and updating their value:

from lxml import etree as ET

xml = '''<?xml version="1.0" encoding="utf-8"?>
<nc:data xmlns:nc="urn:ietf:params:xml:ns:netconf:base:1.0">
    <system xmlns="http://www.abc.xyz">
      <context>
            <name>context_1</name>
            <host>
                <name>xyz</name>
                <tag1>
                    <name>pqr</name>
                    <role>s1</role>
                    <tag2>test</tag2>
                </tag1>
                <tag2>
                    <name>pqr</name>
                    <role>s1</role>
                    <tag2>test</tag2>
                </tag2>              
            </host>
      </context>
    </system>
</nc:data>'''

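# Parse from bytes: under Python 3, lxml rejects a str that carries an encoding declaration
tree = ET.fromstring(xml.encode('utf-8'))
for node in tree.xpath("//*[.='test']"):
    # update node value with new text 'foo'
    node.text = 'foo'
    print(ET.tostring(node).decode())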

Output:

<tag2 xmlns="http://www.abc.xyz" xmlns:nc="urn:ietf:params:xml:ns:netconf:base:1.0">foo</tag2>

<tag2 xmlns="http://www.abc.xyz" xmlns:nc="urn:ietf:params:xml:ns:netconf:base:1.0">foo</tag2>
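The question also asked for the parent tag of each match and reported an empty result from findall('.//context'). The following is a minimal sketch of both points, under some assumptions: the XML is a shortened version of the file in the question, the helper name parent_tags and the prefix abc are made up for illustration, and the blank result is attributed to the default namespace http://www.abc.xyz, which an unqualified tag name does not match.

```python
from lxml import etree

# Shortened version of the XML from the question (bytes, because of the declaration).
xml = b'''<?xml version="1.0" encoding="utf-8"?>
<nc:data xmlns:nc="urn:ietf:params:xml:ns:netconf:base:1.0">
    <system xmlns="http://www.abc.xyz">
        <host>
            <name>xyz</name>
            <tag1>
                <role>s1</role>
                <tag2>test</tag2>
            </tag1>
            <tag2>
                <role>s1</role>
                <tag2>test</tag2>
            </tag2>
        </host>
    </system>
</nc:data>'''

root = etree.fromstring(xml)

def parent_tags(root, text):
    """Return the local names of the parents of elements whose value equals text.

    Hypothetical helper, not part of lxml.
    """
    parents = []
    # $value is an XPath variable, which avoids quoting issues in the search text.
    for node in root.xpath("//*[.=$value]", value=text):
        parent = node.getparent()
        if parent is not None:
            # QName strips the {namespace} part from the tag.
            parents.append(etree.QName(parent).localname)
    return parents

print(parent_tags(root, "test"))

# An unqualified name finds nothing because <host> lives in a default namespace.
unqualified = root.findall(".//host")
# 'abc' is an arbitrary prefix chosen here; only the URI has to match.
qualified = root.findall(".//abc:host", namespaces={"abc": "http://www.abc.xyz"})
print(len(unqualified), len(qualified))
```

With this input the helper returns tag1 and tag2, the unqualified search finds no elements, and the namespace-qualified search finds the single host element.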

Answer 1, 2015-07-27 13:25:36
