
Changing bounding box coordinates in xml file as per new image width and height

I am trying to convert the bounding box coordinates in an XML file so that they match a new image width and height. A sample XML file is given below:

<annotations>
 <image height="940" id="3" name="C_00080.jpg" width="1820">
  <box label="Objects" occluded="0" xbr="801.99255" xtl="777.78656" ybr="506.9955" ytl="481.82132">
   <attribute name="Class">B</attribute>
  </box>
  <box label="Objects" occluded="0" xbr="999.319" xtl="963.38654" ybr="519.2735" ytl="486.68628">
   <attribute name="Class">A</attribute>
  </box>
 </image>
</annotations>

The original image size in the XML is 1820x940, and the box coordinates are expressed relative to that size. I want to rescale the box coordinates to a new image size of 1080x720. I have written the code below; can someone verify it or suggest a better way?

import xml.etree.ElementTree as ET

label_file = '1.xml'
tree = ET.parse(label_file)
root = tree.getroot()

for image in root.findall('image'):
    image.attrib['width'] = '1080'  # Original width = 1820
    image.attrib['height'] = '720'  # Original height = 940
    for allBboxes in image.findall('box'):
        xmin = float(allBboxes.attrib['xtl'])
        xminNew = float(xmin / (1820/1080))
        xminNew = float("{:.5f}".format(xminNew))
        allBboxes.attrib['xtl'] = str(xminNew)
        ymin = float(allBboxes.attrib['ytl'])
        yminNew = float(ymin / (940/720))
        yminNew = float("{:.5f}".format(yminNew))
        allBboxes.attrib['ytl'] = str(yminNew)
        xmax = float(allBboxes.attrib['xbr'])
        xmaxNew = float(xmax / (1820/1080))
        xmaxNew = float("{:.5f}".format(xmaxNew))
        allBboxes.attrib['xbr'] = str(xmaxNew)
        ymax = float(allBboxes.attrib['ybr'])
        ymaxNew = float(ymax / (940/720))
        ymaxNew = float("{:.5f}".format(ymaxNew))
        allBboxes.attrib['ybr'] = str(ymaxNew)

tree.write(label_file)

To improve the code you can:

  • compute the ratios once, before the loop
  • remove the redundant float conversions
  • replace the division by a ratio with a multiplication (dividing by 1820/1080 is the same as multiplying by 1080/1820)
  • drop the rounding of the floats, which may not be necessary
  • group the statements in a coherent order
  • rename allBboxes to box, since it represents only one box

Here is a possible version:

import xml.etree.ElementTree as ET

label_file = '1.xml'
tree = ET.parse(label_file)
root = tree.getroot()

r_w = 1080 / 1820
r_h = 720 / 940

for image in root.findall('image'):
    image.attrib['width'] = '1080'  # Original width = 1820
    image.attrib['height'] = '720'  # Original height = 940

    for box in image.findall('box'):
        xmin = float(box.attrib['xtl'])
        ymin = float(box.attrib['ytl'])
        xmax = float(box.attrib['xbr'])
        ymax = float(box.attrib['ybr'])

        xminNew = xmin * r_w
        yminNew = ymin * r_h
        xmaxNew = xmax * r_w
        ymaxNew = ymax * r_h

        box.attrib['xtl'] = str(xminNew)
        box.attrib['ytl'] = str(yminNew)
        box.attrib['xbr'] = str(xmaxNew)
        box.attrib['ybr'] = str(ymaxNew)

tree.write(label_file)

You can further improve this code by wrapping all this in functions to improve usability, clarity and possible reuse.
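As a minimal sketch of that suggestion, the version below wraps the logic in two functions. The names rescale_boxes and rescale_file and the ndigits parameter are hypothetical, not part of the original code. It also assumes that every image element carries the original width and height attributes, so the ratios are read from the XML instead of being hard-coded.

```python
import xml.etree.ElementTree as ET


def rescale_boxes(root, new_width, new_height, ndigits=5):
    """Rescale every <box> of every <image> under root, in place."""
    for image in root.findall('image'):
        # Read the original size from the XML rather than hard-coding it.
        r_w = new_width / float(image.get('width'))
        r_h = new_height / float(image.get('height'))
        image.set('width', str(new_width))
        image.set('height', str(new_height))

        for box in image.findall('box'):
            # x attributes scale with the width ratio, y attributes with the height ratio.
            for name, ratio in (('xtl', r_w), ('xbr', r_w), ('ytl', r_h), ('ybr', r_h)):
                new_value = float(box.get(name)) * ratio
                box.set(name, f"{new_value:.{ndigits}f}")
    return root


def rescale_file(src, dst, new_width, new_height):
    """Parse src, rescale its boxes, and write the result to dst."""
    tree = ET.parse(src)
    rescale_boxes(tree.getroot(), new_width, new_height)
    tree.write(dst)
```

With this in place, a call such as rescale_file('1.xml', '1_resized.xml', 1080, 720) would do the same job as the script above. The fixed number of decimals is a design choice made here to keep the written attributes short; pass a larger ndigits if more precision is needed.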

Consider a parameterized XSLT solution using Python's third-party module lxml, where you pass the new width and height values from Python and the stylesheet applies the scaling formula to the XML attributes.

XSLT (save as a .xsl file)

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes" encoding="utf-8"/>
  <xsl:strip-space elements="*"/>

  <!-- PARAMS WITH DEFAULTS -->
  <xsl:param name="new_width" select="1080"/>
  <xsl:param name="new_height" select="720"/>  

  <!-- IDENTITY TRANSFORM -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- WIDTH AND HEIGHT ATTRS CHANGE -->
  <xsl:template match="image">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:attribute name="width"><xsl:value-of select="$new_width"/></xsl:attribute>
      <xsl:attribute name="height"><xsl:value-of select="$new_height"/></xsl:attribute>
      <xsl:apply-templates select="node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- X ATTRS CHANGE -->
  <xsl:template match="box/@xbr|box/@xtl">
      <xsl:variable select="ancestor::image/@width" name="curr_width"/>

      <xsl:attribute name="{name(.)}">
          <xsl:value-of select="format-number(. div ($curr_width div $new_width) , '#.00000')"/>
      </xsl:attribute>
  </xsl:template>

  <!-- Y ATTRS CHANGE -->
  <xsl:template match="box/@ybr|box/@ytl">
      <xsl:variable select="ancestor::image/@height" name="curr_height"/>

      <xsl:attribute name="{name(.)}">
          <xsl:value-of select="format-number(. div ($curr_height div $new_height), '#.00000')"/>
      </xsl:attribute>
  </xsl:template>

</xsl:stylesheet>

Python (no for loop or if logic)

import lxml.etree as et

# LOAD XML AND XSL SCRIPT
xml = et.parse('Input.xml')
xsl = et.parse('Script.xsl')

# PASS PARAMETERS TO XSLT
transform = et.XSLT(xsl)
result = transform(xml, new_width = et.XSLT.strparam(str(1080)), 
                        new_height = et.XSLT.strparam(str(720)))

# SAVE RESULT TO FILE
with open("Output.xml", 'wb') as f:
    f.write(result)

Output

<?xml version="1.0" encoding="utf-8"?>
<annotations>
  <image height="720" id="3" name="C_00080.jpg" width="1080">
    <box label="Objects" occluded="0" xbr="475.90767" xtl="461.54367" ybr="388.33698" ytl="369.05463">
      <attribute name="Class">B</attribute>
    </box>
    <box label="Objects" occluded="0" xbr="593.00248" xtl="571.67992" ybr="397.74140" ytl="372.78098">
      <attribute name="Class">A</attribute>
    </box>
  </image>
</annotations>
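As a quick sanity check, the values of the first box can be reproduced with plain arithmetic. This is only a sketch: the variable names are illustrative, and the sizes and coordinates are copied from the sample XML in the question.

```python
# Original and target image sizes from the question.
old_w, old_h = 1820, 940
new_w, new_h = 1080, 720

# First box from the sample XML.
box = {'xtl': 777.78656, 'ytl': 481.82132, 'xbr': 801.99255, 'ybr': 506.9955}

# x attributes scale with the width ratio, y attributes with the height ratio.
scaled = {
    name: f"{value * (new_w / old_w if name.startswith('x') else new_h / old_h):.5f}"
    for name, value in box.items()
}
print(scaled)
```

The printed values should agree with the attributes of the first box in the output above.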
