简体   繁体   中英

Python Manipulate and save XML without third-party libraries

I have a xml in which I have to search for a tag and replace the value of tag with a new values. For example,

<tag-Name>oldName</tag-Name>  

and replace oldName to newName like

<tag-Name>newName</tag-Name>    

and save the xml. How can I do that without using thirdparty lib like BeautifulSoup

Thank you

The best option from the standard lib is (I think) the xml.etree package.

Assuming that your example tag occurs only once somewhere in the document:

import xml.etree.ElementTree as etree
# or for a faster C implementation
# import xml.etree.cElementTree as etree

tree = etree.parse('input.xml')
elem = tree.find('//tag-Name') # finds the first occurrence of element tag-Name
elem.text = 'newName'
tree.write('output.xml')

Or if there are multiple occurrences of tag-Name, and you want to change them all if they have "oldName" as content:

import xml.etree.cElementTree as etree

tree = etree.parse('input.xml')
for elem in tree.findall('//tag-Name'):
    if elem.text == 'oldName':
        elem.text = 'newName'
# some output options for example
tree.write('output.xml', encoding='utf-8', xml_declaration=True)

Python has 'builtin' libraries for working with xml. For this simple task I'd look into minidom. You can find the docs here:

http://docs.python.org/library/xml.dom.minidom.html

If you are certain, I mean, completely 100% positive that the string <tag-Name> will never appear inside that tag and the XML will always be formatted like that, you can always use good old string manipulation tricks like:

xmlstring = xmlstring.replace('<tag-Name>oldName</tag-Name>', '<tag-Name>newName</tag-Name>')

If the XML isn't always as conveniently formatted as <tag>value</tag> , you could write something like:

a = """<tag-Name>


oldName


    </tag-Name>"""

def replacetagvalue(xmlstring, tag, oldvalue, newvalue):
    start = 0
    while True:
        try:
            start = xmlstring.index('<%s>' % tag, start) + 2 + len(tag)
        except ValueError:
            break
        end = xmlstring.index('</%s>' % tag, start)
        value = xmlstring[start:end].strip()
        if value == oldvalue:
            xmlstring = xmlstring[:start] + newvalue + xmlstring[end:]
    return xmlstring

print replacetagvalue(a, 'tag-Name', 'oldName', 'newName')

Others have already mentioned xml.dom.minidom which is probably a better idea if you're not absolutely certain beyond any doubt that your XML will be so simple. If you can guarantee that, though, remember that XML is just a big chunk of text that you can manipulate as needed.

I use similar tricks in some production code where simple checks like if "somevalue" in htmlpage are much faster and more understandable than invoking BeautifulSoup, lxml, or other *ML libraries.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM