
Remove empty tags in XML using Regex ABAP

I have a problem with generating XML. I use a Simple Transformation, and many of the tags in the resulting XML are empty. I read that I can get rid of those tags with a regex, but it doesn't work perfectly. Here is what it looks like.

Without Regex:

<?xml version="1.0" encoding="utf-8" ?>
<Invoice
    xmlns="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2"
    xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2"
    xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
>
    <cbc:DueDate />
    <cbc:InvoiceTypeCode>380</cbc:InvoiceTypeCode>
    <cbc:Note />
    <cbc:DocumentCurrencyCode>PLN</cbc:DocumentCurrencyCode>
    <cbc:TaxCurrencyCode />
    <cbc:BuyerReference />
    <cac:InvoicePeriod>
        <cbc:StartDate />
        <cbc:EndDate />
        <cbc:DescriptionCode />
    </cac:InvoicePeriod>

Regex written in ABAP:

    REPLACE ALL OCCURRENCES OF REGEX
      '(<!\[CDATA\[([^]]|(\][^]])|(\]\][^>]))*\]\]>)|(<([^?][^><\s]*)(\s[^><]+)?/>)'
      IN exportxml
      WITH '$1'.

After using Regex:

      <cbc:InvoiceTypeCode>380</cbc:InvoiceTypeCode> 
      <cbc:DocumentCurrencyCode>PLN</cbc:DocumentCurrencyCode> 
      <cac:InvoicePeriod />

The Simple Transformation looks like this:

<?sap.transform simple?>
<tt:transform xmlns:tt="http://www.sap.com/transformation-templates" xmlns:ddic="http://www.sap.com/abapxml/types/dictionary" xmlns:def="http://www.sap.com/abapxml/types/defined">
  <tt:root name="ZXT_INVOICE" type="ddic:ZXT_INVOICE"/>
  <tt:template>
    <Invoice
        xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2"
        xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2"
        xmlns:ccts="urn:un:unece:uncefact:documentation:2"
        xmlns:qdt="urn:oasis:names:specification:ubl:schema:xsd:QualifiedDatatypes-2"
        xmlns:udt="urn:un:unece:uncefact:data:specification:UnqualifiedDataTypesSchemaModule:2"
        xmlns:xs="http://www.w3.org/2001/XMLSchema"
        xmlns="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2"
    >
      <cbc:DueDate tt:value-ref=".ZXT_INVOICE.DUEDATE"/>
      <cbc:InvoiceTypeCode tt:value-ref=".ZXT_INVOICE.INVOICETYPECODE"/>
      <cbc:Note tt:value-ref=".ZXT_INVOICE.NOTE"/>
      <cbc:DocumentCurrencyCode tt:value-ref=".ZXT_INVOICE.DOCUMENTCURRENCYCODE"/>
      <cbc:TaxCurrencyCode tt:value-ref=".ZXT_INVOICE.TAXCURRENCYCODE"/>
      <cbc:AccountingCost tt:value-ref=".ZXT_INVOICE.ACCOUNTINGCOST"/>
      <cbc:BuyerReference tt:value-ref=".ZXT_INVOICE.BUYERREFERENCE"/>
      <cac:InvoicePeriod>
        <cbc:StartDate tt:value-ref=".ZXT_INVOICE.INVOICE_PERIOD.STARTDATE"/>
        <cbc:EndDate tt:value-ref=".ZXT_INVOICE.INVOICE_PERIOD.ENDDATE"/>
        <cbc:DescriptionCode tt:value-ref=".ZXT_INVOICE.INVOICE_PERIOD.DESCRIPTIONCODE"/>
      </cac:InvoicePeriod>
    </Invoice>
  </tt:template>
</tt:transform>

The regex removes simple empty elements, but it has a problem with nested elements such as <cac:InvoicePeriod>: once its children are removed, the now-empty parent is left behind. My program has many nested elements. Can you help me modify the regex or find another solution?

Thanks for any help.

Your ABAP regex literal:

(<!\[CDATA\[([^]]|(\][^]])|(\]\][^>]))*\]\]>)|(<([^?][^><\s]*)(\s[^><]+)?/>)

could be corrected and simplified as follows:

(<!\[CDATA\[(?:(?!\]\]>).)*\]\]>)|<[^?!](?:(?!>|\/>).)*\/>

NB: (?!xyz). is a negated preview condition (a negative lookahead) followed by a dot. It matches any single character ( . ), provided that the text at that position does not start with xyz .
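Note that a pattern of this kind only removes self-closing tags, so a parent whose children have all been removed is still left behind after one pass. One possible way around this is to repeat the replacement until nothing changes. The following is only a rough sketch under assumptions: the variable sample, its content, and the second pattern (for a start/end tag pair with nothing but whitespace in between) are made up for illustration and do not come from the question or the answer. It also ignores CDATA sections, which the full regex above preserves through '$1'.

```abap
" Illustrative sketch: sample, its content and the second pattern are
" assumptions, not taken from the question or the answer.
DATA sample  TYPE string.
DATA changed TYPE abap_bool.

sample = `<a><b/><c><d/></c><e>text</e></a>`.

DO.
  changed = abap_false.

  " Self-closing tags: the second alternative of the regex above.
  REPLACE ALL OCCURRENCES OF REGEX `<[^?!](?:(?!>|\/>).)*\/>`
    IN sample WITH ``.
  IF sy-subrc = 0.
    changed = abap_true.
  ENDIF.

  " Start/end tag pairs with only whitespace in between, e.g. <c></c>.
  " \1 refers back to the element name captured by the first group.
  REPLACE ALL OCCURRENCES OF REGEX `<([^\s<>/?!]+)(?:\s[^<>]*)?>\s*</\1>`
    IN sample WITH ``.
  IF sy-subrc = 0.
    changed = abap_true.
  ENDIF.

  " Stop as soon as a full pass has removed nothing.
  IF changed = abap_false.
    EXIT.
  ENDIF.
ENDDO.

" First pass: <b/> and <d/> are removed, then the now-empty <c></c>.
" Second pass: nothing matches, so the loop ends with <a><e>text</e></a>.
```

Each successful replacement shortens the string, so the loop terminates. Like the original regex, both patterns also remove elements that carry attributes but no content, which may or may not be what you want.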

XSLT solution: remove empty XML elements recursively.

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>

<xsl:template match="*[descendant::text() or descendant-or-self::*/@*[string()]]">
    <xsl:copy>
        <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
</xsl:template>

<xsl:template match="@*[string()]">
    <xsl:copy/>
</xsl:template>

</xsl:stylesheet>
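To use a stylesheet like this from ABAP, one option is to save it as an XSLT program in the ABAP repository and run it on the XML produced by the Simple Transformation. A minimal sketch, assuming the stylesheet was saved under the hypothetical name Z_REMOVE_EMPTY_TAGS and that exportxml (from the question) is a string holding the generated XML; cleanxml is an illustrative name as well:

```abap
" Z_REMOVE_EMPTY_TAGS is a hypothetical name for the XSLT program above;
" cleanxml is an illustrative variable name.
DATA cleanxml TYPE string.

" Run the XSLT on the XML created by the Simple Transformation.
" Transformation errors are raised as exceptions and are not handled here.
CALL TRANSFORMATION z_remove_empty_tags
  SOURCE XML exportxml
  RESULT XML cleanxml.
```

Because the stylesheet works on the parsed document rather than on text, nested elements that become empty are dropped together with their children in a single call.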


This works perfectly for me. Thanks for the help.
