
Replace XML tag contents using Python

I have an XML file with some data:

<Emp>
<Name>Raja</Name>
<Location>
     <city>ABC</city>
     <geocode>123</geocode>
     <state>XYZ</state> 
</Location>
<sal>100</sal>
<type>temp</type> 
</Emp>

The location information in the XML file is wrong, so I have to replace it.

I have constructed the location information with corrected values in Python:

variable = '''
    <Location isupdated="1">
         <city>MyCity</city>
         <geocode>10.12</geocode>
         <state>MyState</state> 
    </Location>'''

So the Location tag should be replaced with the new information. Is there a simple way to do this in Python?

I want the final result to look like this:

<Emp>
<Name>Raja</Name>
<Location isupdated="1">
         <city>MyCity</city>
         <geocode>10.12</geocode>
         <state>MyState</state>
</Location>
<sal>100</sal>
<type>temp</type> 
</Emp>

Any thoughts?

Thanks.

UPDATE - XML PARSER IMPLEMENTATION: since replacing one specific <Location> tag would require modifying the regex, I'm providing a more general and safer alternative based on the ElementTree parser (as suggested above by @stribizhev and @Saket Mittal).

I had to add a root element <Emps> (a valid XML document requires a single root element). I also chose to select the location to edit by its <city> tag, but any other field would work as well:

#!/usr/bin/python
# Alternative Implementation with ElementTree XML Parser

xml = '''\
<Emps>
    <Emp>
        <Name>Raja</Name>
        <Location>
            <city>ABC</city>
            <geocode>123</geocode>
            <state>XYZ</state>
        </Location>
        <sal>100</sal>
        <type>temp</type>
    </Emp>
    <Emp>
        <Name>GsusRecovery</Name>
        <Location>
            <city>Torino</city>
            <geocode>456</geocode>
            <state>UVW</state>
        </Location>
        <sal>120</sal>
        <type>perm</type>
    </Emp>
</Emps>
'''

from xml.etree import ElementTree as ET
# tree = ET.parse('input.xml')  # uncomment to parse the XML from a file
tree = ET.ElementTree(ET.fromstring(xml))
root = tree.getroot()

# Update only the <Location> whose <city> is 'Torino'
for location in root.iter('Location'):
    if location.findtext('city') == 'Torino':
        location.set("isupdated", "1")
        location.find('city').text = 'MyCity'
        location.find('geocode').text = '10.12'
        location.find('state').text = 'MyState'

# tostring() returns bytes on Python 3, so decode before printing
print(ET.tostring(root, encoding='utf8', method='xml').decode('utf8'))
# tree.write('output.xml')  # uncomment to write the result to a file

Online runnable version of the code here
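
The loop above edits the child values in place. If the corrected block already exists as a string, as in the question, another option is to parse that string and swap it in for the old element. This is a minimal sketch, assuming the attribute is quoted (`isupdated="1"`) so that the fragment is well-formed XML. The names `replace_location`, `DOC` and `NEW_LOCATION` are hypothetical, not part of ElementTree:

```python
from xml.etree import ElementTree as ET

DOC = """\
<Emps>
    <Emp>
        <Name>Raja</Name>
        <Location>
            <city>ABC</city>
            <geocode>123</geocode>
            <state>XYZ</state>
        </Location>
        <sal>100</sal>
    </Emp>
</Emps>
"""

NEW_LOCATION = """\
<Location isupdated="1">
    <city>MyCity</city>
    <geocode>10.12</geocode>
    <state>MyState</state>
</Location>"""


def replace_location(root, old_city, new_fragment):
    """Swap every <Location> whose <city> equals old_city for new_fragment.

    Hypothetical helper; returns the number of elements replaced.
    """
    replaced = 0
    # ElementTree elements have no parent pointer, so visit each possible
    # parent and look at its direct children. list() takes a snapshot so
    # the tree can be modified safely while looping.
    for parent in list(root.iter()):
        for index, child in enumerate(list(parent)):
            if child.tag == "Location" and child.findtext("city") == old_city:
                new_elem = ET.fromstring(new_fragment)  # fresh copy each time
                new_elem.tail = child.tail  # keep the whitespace that followed
                parent[index] = new_elem    # replace at the same position
                replaced += 1
    return replaced


root = ET.fromstring(DOC)
count = replace_location(root, "ABC", NEW_LOCATION)
result = ET.tostring(root, encoding="utf8").decode("utf8")
print(result)
```

Parsing the fragment first means a malformed replacement raises an error instead of silently producing a broken document, which is the main advantage over string substitution.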

PREVIOUS REGEX IMPLEMENTATION

This is a possible implementation using the lazy quantifier .*? and the dot-all flag (?s):

#!/usr/bin/python

import re

xml = '''\
<Emp>
<Name>Raja</Name>
<Location>
     <city>ABC</city>
     <geocode>123</geocode>
     <state>XYZ</state>
</Location>
</Emp>'''

locUpdate = '''\
    <Location isupdated="1">
         <city>MyCity</city>
         <geocode>10.12</geocode>
         <state>MyState</state>
    </Location>'''

# Replace the whole <Location>...</Location> block with locUpdate
output = re.sub(r"(?s)<Location>.*?</Location>", locUpdate, xml)

print(output)

You can test the code online here
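
To see why both the lazy quantifier and the dot-all flag matter, the sketch below runs three variants of the pattern on a made-up two-employee string (`sample` is illustrative data, not taken from the question):

```python
import re

sample = (
    "<Emp><Location>\n<city>A</city>\n</Location><sal>1</sal></Emp>"
    "<Emp><Location>\n<city>B</city>\n</Location><sal>2</sal></Emp>"
)

# Without (?s) the dot cannot cross the newlines, so nothing matches.
no_dotall = re.findall(r"<Location>.*?</Location>", sample)

# Greedy .* runs from the first <Location> to the last </Location>,
# swallowing everything in between as a single match.
greedy = re.findall(r"(?s)<Location>.*</Location>", sample)

# Lazy .*? stops at the nearest </Location>, giving one match per element.
lazy = re.findall(r"(?s)<Location>.*?</Location>", sample)

print(len(no_dotall), len(greedy), len(lazy))
```

The greedy variant is the dangerous one: used with re.sub it would delete the <sal> of the first employee along with both locations.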

Caveat: if there is more than one <Location> tag in the XML input, the regex replaces them all with locUpdate. To replace only the first occurrence, use:

# count=1 limits the substitution to the first occurrence
output = re.sub(r"(?s)<Location>.*?</Location>", locUpdate, xml, count=1)
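
A small check of the `count` argument, using a made-up string with two <Location> elements (`xml_two` and `new_loc` are illustrative names). The replacement is passed as a function here so that any backslashes in it are inserted literally rather than interpreted as escapes. That is an extra precaution, not something the code above does:

```python
import re

xml_two = (
    "<Emps>"
    "<Location><city>ABC</city></Location>"
    "<Location><city>Torino</city></Location>"
    "</Emps>"
)
new_loc = '<Location isupdated="1"><city>MyCity</city></Location>'
pattern = r"(?s)<Location>.*?</Location>"

# Default behaviour: every match is replaced.
replace_all = re.sub(pattern, lambda m: new_loc, xml_two)

# count=1: only the first match is replaced, the second is left alone.
replace_first = re.sub(pattern, lambda m: new_loc, xml_two, count=1)

print(replace_all.count("MyCity"))
print(replace_first.count("MyCity"))
```

Note that `count=1` always targets the first match in document order. Selecting a particular <Location> by its content is easier with the ElementTree approach.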
