Converting XML to CSV using xsltproc with a complicated file

I have a big XML file that I am desperately trying to convert into a CSV file with xsltproc.

All the data I want to extract is under the <GRP alias="TRIUT" level="5"> elements.

I only need these fields from the file:

  • Matricule
  • Name
  • Value of Mount1 from the "Rubrique" 976 of the ELEMENT_1

This is my XML:

<?xml version="1.0" encoding="UTF-8"?>
<RPT>
   <GRP alias="Reglementation" level="1">
      <FLD id="Reglementation">USA</FLD>
      <GRP alias="RUPT1" level="2">
         <FLD id="RUPT1" />
         <GRP alias="RUPT2" level="3">
            <FLD id="RUPT2" />
            <GRP alias="RUPT3" level="4">
               <FLD id="RUPT3" />
               <GRP alias="TRIUT" level="5">
                  <FLD id="TRIUT">00-532</FLD>
                  <DTL>
                     <FLD id="DateEdition" type="DATE">2017-02-01</FLD>
                     <FLD id="Name">MR CHARLIE CHAPLIN</FLD>
                     <FLD id="Matricule">12345678</FLD>
                     <SRPT id="ELEMENT_1">
                        <DTL>
                           <FLD id="Rubrique">038</FLD>
                           <FLD id="Mount1" type="FLOAT">2200.95</FLD>
                           <FLD id="Mount2" type="FLOAT">00000.00</FLD>
                        </DTL>
                        <DTL>
                           <FLD id="Rubrique">976</FLD>
                           <FLD id="Mount1">9926.96</FLD>
                           <FLD id="Mount2">00000.00</FLD>
                        </DTL>
                     </SRPT>
                  </DTL>
               </GRP>
               <GRP alias="TRIUT" level="5">
                  <FLD id="TRIUT">00186</FLD>
                  <DTL>
                     <FLD id="DateEdition">2017-03-31</FLD>
                     <FLD id="Nom">MR JAMES BOND</FLD>
                     <FLD id="Matricule">00000007</FLD>
                     <SRPT id="ELEMENT_1">
                        <DTL>
                           <FLD id="Rubrique">038</FLD>
                           <FLD id="Mount1">2054.22</FLD>
                           <FLD id="Mount2">000000.00</FLD>
                        </DTL>
                        <DTL>
                           <FLD id="Rubrique">976</FLD>
                           <FLD id="Mount1">2054.22</FLD>
                           <FLD id="Mount2">00000.22</FLD>
                        </DTL>
                     </SRPT>
                  </DTL>
               </GRP>
            </GRP>
         </GRP>
      </GRP>
   </GRP>
</RPT>

What I want to see is:

Matricule;Name;Rubrique976_Mount1
12345678;MR CHARLIE CHAPLIN;9926.96
00000007;MR JAMES BOND;2054.22

Do you think it's possible?

This is what I tried, but it doesn't do what I need at all:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
   <xsl:output method="text" encoding="UTF-8" />
   <xsl:strip-space elements="*" />
   <xsl:template match="/">
      <xsl:text>Matricule;Name;Rubrique976_Mount1</xsl:text>
      <xsl:text>&#xA;</xsl:text>
      <xsl:for-each select="RPT/GRP/GRP/GRP/GRP/GRP/DTL">
         <xsl:for-each select="FLD">
            <xsl:value-of select="@id" />
            <xsl:text>;</xsl:text>
            <xsl:value-of select="." />
            <xsl:text>;</xsl:text>
            <xsl:for-each select="SRPT">
               <xsl:value-of select="@id" />
               <xsl:text>;</xsl:text>
               <xsl:value-of select="." />
               <xsl:text>;</xsl:text>
            </xsl:for-each>
         </xsl:for-each>
         <xsl:text>&#xA;</xsl:text>
      </xsl:for-each>
   </xsl:template>
</xsl:stylesheet>

This is what I get, which is not what I want:

Matricule;Name;Rubrique976_Mount1
DateEdition;2017-02-01;Name;MR CHARLIE CHAPLIN;Matricule;12345678;
DateEdition;2017-03-31;Nom;MR JAMES BOND;Matricule;00000007;

Thanks to anyone willing to rack their brains over this!

You should start off by selecting the GRP elements that have the matching attributes:

<xsl:apply-templates select="//GRP[@alias='TRIUT' and @level='5']" />

You can then have a template matching GRP that outputs the fields you need. For example, outputting the "Matricule" would look like this:

<xsl:value-of select="DTL/FLD[@id='Matricule']" />

Outputting the "Value of Mount1 from the "Rubrique" 976 of the ELEMENT_1" is a bit more complicated because a number of conditions are involved:

<xsl:value-of select="DTL
                       /SRPT[@id='ELEMENT_1']
                        /DTL[FLD[@id='Rubrique']='976']
                         /FLD[@id='Mount1']" />
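
To make the conditions in that path easier to follow, here is a minimal sketch that binds the matching row to a variable before reading the field. The variable name row976 is my own choice for illustration, and this stylesheet only prints the Mount1 value for each TRIUT group, one per line.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text" encoding="UTF-8" />

  <xsl:template match="/">
    <xsl:apply-templates select="//GRP[@alias='TRIUT' and @level='5']" />
  </xsl:template>

  <xsl:template match="GRP">
    <!-- Step 1: the inner DTL rows of ELEMENT_1 whose Rubrique field equals 976 -->
    <!-- "row976" is a hypothetical name, not something from the question -->
    <xsl:variable name="row976"
                  select="DTL/SRPT[@id='ELEMENT_1']/DTL[FLD[@id='Rubrique'] = '976']" />
    <!-- Step 2: within that row, the Mount1 field -->
    <xsl:value-of select="$row976/FLD[@id='Mount1']" />
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

Note that in XSLT 1.0, xsl:value-of outputs only the first node of the selected set, so if a group ever contained several 976 rows, only the first Mount1 would be printed.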

Try this XSLT:

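<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <!-- Emit plain text; the default output method is XML, which would prepend an XML declaration -->
  <xsl:output method="text" encoding="UTF-8" />

  <xsl:template match="/">
    <!-- Header row, using the ';' separator from the desired output -->
    <xsl:text>Matricule;Name;Rubrique976_Mount1&#10;</xsl:text>
    <xsl:apply-templates select="//GRP[@alias='TRIUT' and @level='5']" />
  </xsl:template>

  <!-- One CSV row per selected GRP -->
  <xsl:template match="GRP">
    <xsl:value-of select="DTL/FLD[@id='Matricule']" />
    <xsl:text>;</xsl:text>
    <!-- The sample data uses both 'Name' and 'Nom' for this field -->
    <xsl:value-of select="DTL/FLD[@id='Name' or @id='Nom']" />
    <xsl:text>;</xsl:text>
    <xsl:value-of select="DTL/SRPT[@id='ELEMENT_1']/DTL[FLD[@id='Rubrique']='976']/FLD[@id='Mount1']" />
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>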

EDIT: To sort by "Matricule", change the xsl:apply-templates to have an xsl:sort statement, like so:

<xsl:apply-templates select="//GRP[@alias='TRIUT' and @level='5']">
    <xsl:sort select="DTL/FLD[@id='Matricule']" />
</xsl:apply-templates>
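
For completeness, here is a sketch of where the sort sits in a full stylesheet. It assumes the ';' separator from the desired output. The data-type="text" attribute is the XSLT 1.0 default and is spelled out only for clarity.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text" encoding="UTF-8" />

  <xsl:template match="/">
    <xsl:text>Matricule;Name;Rubrique976_Mount1&#10;</xsl:text>
    <xsl:apply-templates select="//GRP[@alias='TRIUT' and @level='5']">
      <!-- The sort key is evaluated relative to each selected GRP -->
      <xsl:sort select="DTL/FLD[@id='Matricule']" data-type="text" order="ascending" />
    </xsl:apply-templates>
  </xsl:template>

  <xsl:template match="GRP">
    <xsl:value-of select="DTL/FLD[@id='Matricule']" />
    <xsl:text>;</xsl:text>
    <xsl:value-of select="DTL/FLD[@id='Name' or @id='Nom']" />
    <xsl:text>;</xsl:text>
    <xsl:value-of select="DTL/SRPT[@id='ELEMENT_1']/DTL[FLD[@id='Rubrique']='976']/FLD[@id='Mount1']" />
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

Because the Matricule values in the sample are zero-padded to the same width, text order and numeric order agree, so I would expect the 00000007 row to come before the 12345678 row. If the values can vary in width, data-type="number" on xsl:sort is the option to look at.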
