Append new elements to XML

Question

I have a base XML file to which I would like to add new elements, but it fails and I cannot understand why.

My base XML:

<?xml version="1.0" encoding="utf-8"?>
<vehicleDefinitions>
    <vehicleType id="bus">
        <capacity>
            <seats persons="3"/>
            <standingRoom persons="9"/>
        </capacity>
        <length meter="12.3"/>
        <width meter="2.5"/>
        <accessTime secondsPerPerson="0.5"/>
        <egressTime secondsPerPerson="0.5"/>
        <doorOperation mode="serial"/>
        <passengerCarEquivalents pce="0.28"/>
    </vehicleType>
</vehicleDefinitions>

My code:

from lxml import etree

schedule = etree.parse('schedule_mapped.xml') #I use this file to get data from it
vehicles = etree.parse('vehicles.xml') #I'm reading my base XML
vehicles_root = vehicles.getroot() #Getting its root
for transitLine in schedule.findall('transitLine'):
    tstype = transitLine.find('transitRoute').find('transportMode').text
    for transitRoute in transitLine.findall('transitRoute'):
        for departure in transitRoute.find('departures').findall('departure'):
            tsname = departure.get('vehicleRefId')
            vehicle = etree.SubElement(vehicles_root, 'vehicle') #I want to add a child to my root element
            vehicle.attrib['id'] = tsname
            vehicle.attrib['type'] = tstype

The structure of my output XML is correct: the children are added.

But after writing the XML to a file

with open(ts.replace('schedule', 'vehicles'), 'wb') as f:
    f.write(etree.tostring(vehicles, pretty_print=True, encoding='utf8'))

I got output that is not pretty-printed correctly.

I suspect the problem is caused by non-printing (whitespace) characters in the base XML, but I do not know how to deal with this.

Answer 1

Consider also XSLT, the special-purpose language designed to transform XML files, which can retrieve nodes from a different XML file using the document() function. You also get finer control over the output, including indentation, line breaks, and headers. Python's lxml can run XSLT 1.0 scripts, so you avoid nested loops at the application layer.

XSLT (save as a .xsl file, to be used in Python below)

Notice the reference to the other .xml file. Both XML files are assumed to be in the same directory.

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes" encoding="UTF-8"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="vehicleDefinitions">
    <xsl:copy>
        <xsl:copy-of select="vehicleType"/>
        <xsl:for-each select="document('schedule_mapped.xml')/descendant::departure">
          <vehicle id="{@vehicleRefId}" 
                   type="{../preceding-sibling::transportMode}"/>
        </xsl:for-each>
    </xsl:copy>
  </xsl:template>
    
</xsl:stylesheet>

Python

from lxml import etree

doc = etree.parse('vehicles.xml')  # the base XML from the question
xsl = etree.parse('script.xsl')    # the stylesheet shown above

transformer = etree.XSLT(xsl)
result = transformer(doc)

with open('Output.xml', 'wb') as f:
    f.write(result)
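
To try the transform without the real data, here is a minimal self-contained sketch. The tiny schedule and vehicles documents, the ids bus_1 and bus_2, and the temporary directory are all made up for illustration; only the stylesheet follows the one above. It assumes document('schedule_mapped.xml') is resolved relative to the location of the stylesheet file, which is why all three files are written to the same directory.

```python
import os
import tempfile
from lxml import etree

# Hypothetical miniature inputs; the real files are much larger.
SCHEDULE = """<transitSchedule>
  <transitLine id="line1">
    <transitRoute id="route1">
      <transportMode>bus</transportMode>
      <departures>
        <departure id="d1" vehicleRefId="bus_1"/>
        <departure id="d2" vehicleRefId="bus_2"/>
      </departures>
    </transitRoute>
  </transitLine>
</transitSchedule>"""

VEHICLES = """<vehicleDefinitions>
    <vehicleType id="bus"><length meter="12.3"/></vehicleType>
</vehicleDefinitions>"""

STYLESHEET = """<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes" encoding="UTF-8"/>
  <xsl:strip-space elements="*"/>
  <xsl:template match="vehicleDefinitions">
    <xsl:copy>
      <xsl:copy-of select="vehicleType"/>
      <xsl:for-each select="document('schedule_mapped.xml')/descendant::departure">
        <vehicle id="{@vehicleRefId}" type="{../preceding-sibling::transportMode}"/>
      </xsl:for-each>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>"""

files = {
    'schedule_mapped.xml': SCHEDULE,
    'vehicles.xml': VEHICLES,
    'script.xsl': STYLESHEET,
}

with tempfile.TemporaryDirectory() as workdir:
    # Write all three files side by side so the relative document() call can find the schedule.
    for name, content in files.items():
        with open(os.path.join(workdir, name), 'w', encoding='utf-8') as f:
            f.write(content)

    transformer = etree.XSLT(etree.parse(os.path.join(workdir, 'script.xsl')))
    result = transformer(etree.parse(os.path.join(workdir, 'vehicles.xml')))

    # Collect what the stylesheet generated, one <vehicle> per <departure>.
    new_vehicles = result.getroot().findall('vehicle')
    vehicle_ids = [v.get('id') for v in new_vehicles]
    vehicle_types = {v.get('type') for v in new_vehicles}

print(str(result))
```

str(result) serializes the result according to the xsl:output settings, which is handy for a quick look before writing the file.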

Answer 2

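So, finally I found a solution. We can parse the XML with a parser that discards whitespace-only text nodes (remove_blank_text=True). That allows pretty_print to work correctly.

from lxml import etree

def getClean(filename):
    # Drop ignorable whitespace so the serializer is free to re-indent the tree
    parser = etree.XMLParser(remove_blank_text=True)
    cleanTree = etree.parse(filename, parser)
    return cleanTree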
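
The effect can be seen on a small in-memory document. This is only a sketch: the shortened base XML, the helper name append_vehicle, and the id bus_1 are made up for illustration. The same element is appended twice, once with the whitespace kept and once with remove_blank_text=True, and both trees are serialized with pretty_print=True.

```python
from lxml import etree

# Shortened, hypothetical version of the base XML from the question.
BASE_XML = b"""<?xml version="1.0" encoding="utf-8"?>
<vehicleDefinitions>
    <vehicleType id="bus">
        <length meter="12.3"/>
    </vehicleType>
</vehicleDefinitions>
"""

def append_vehicle(xml_bytes, strip_blank):
    """Parse, append one <vehicle> child to the root, return pretty-printed text."""
    parser = etree.XMLParser(remove_blank_text=strip_blank)
    root = etree.fromstring(xml_bytes, parser)
    vehicle = etree.SubElement(root, 'vehicle')
    vehicle.set('id', 'bus_1')  # hypothetical vehicleRefId
    vehicle.set('type', 'bus')
    return etree.tostring(root, pretty_print=True, encoding='unicode')

kept = append_vehicle(BASE_XML, strip_blank=False)
stripped = append_vehicle(BASE_XML, strip_blank=True)

print(kept)      # existing whitespace is preserved, the new element is not indented
print(stripped)  # the whole tree is re-indented consistently
```

When whitespace-only text nodes are left in the tree, the serializer treats them as real content and does not add its own indentation around the new element, which matches the behaviour described in the question.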
