简体   繁体   中英

How do I match SQL table rows with XML namespace elements?

I've successfully saved down this XML file to my server using the following PHP:

file_put_contents("test.xml", fopen("http://www.treasury.gov/resource-center/data-chart-center/interest-rates/pages/XmlView.aspx?data=yieldyear&year=2015", 'r'));

Now i'm trying to get it into my DB. Just trying to get it to work by executing SQL in the phpMyAdmin GUI. I successfully setup the following table:

CREATE TABLE `test` (
  `NEW_DATE` varchar(40) NOT NULL,
  `BC_1MONTH` int(11) NULL,
  `BC_3MONTH` int(11) NULL,
  PRIMARY KEY (`NEW_DATE`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

The following code runs without error, but just results in the addition of one empty row to my DB.

LOAD XML LOCAL INFILE 'test.xml'
INTO TABLE test
ROWS IDENTIFIED BY '<content>';

I see here that MySQL server looks for field names matching the column names of the target table. It is also mentioned that there is no requirement for every field in the XML file be matched with a column in the corresponding table. Fields which have no corresponding columns are skipped. Are my column names not matching the XML because of the XML prefixes and namespaces or am I barking up the wrong tree?

This question shows how to reference the same XML elements using PHP. Apparently prefix definition is required using registerXPathNamespace() in order to build an Xpath:

Do I need to build such an Xpath in SQL?

Perhaps I could remove the namespace/prefix data in PHP when I save the XML down using something like LIBXML_NSCLEAN ; apparently this removes redundant namespace declarations. What qualifies as redundant?

Another option for removing namespaces seems to be XSL stylesheets. Per this question.

What is the best approach to solving this issue?

Essentially, your XML is too complex (nested nodes, attributes, namespaces) to readily import into MySQL using LOAD XML . As your above link shows the statement supports only three different formats:

<row column1="value1" column2="value2" .../>

<row>
  <column1>value1</column1>
  <column2>value2</column2>
</row>

<row>
  <field name='column1'>value1</field>
  <field name='column2'>value2</field>
</row>

Therefore, you need to transform your raw XML into such format above aligned of course to your table's fields. Using XSLT can help tremendously. As information, XSLT is a special programming language that re-structures XML content to various forms for end-use needs. Like other general purpose languages including Java, C#, Python, and VB, PHP comes equipped with an XSLT 1.0 processor. You may need to enable the extension.

XSLT Script (save as .xsl or .xslt file)

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                              xmlns:doc="http://www.w3.org/2005/Atom"
                              xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata"
                              xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices"
                              exclude-result-prefixes="doc m d">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>

<!-- IDENTITY TRANSFORM -->
<xsl:template match="@*|node()">
    <xsl:copy>
        <xsl:apply-templates select="node()"/>
    </xsl:copy>
</xsl:template>

<!-- PARSING NEEDED CONTENT -->
<xsl:template match="doc:entry">
    <row>
        <NEW_DATE><xsl:value-of select="doc:content/m:properties/d:NEW_DATE"/></NEW_DATE>
        <BC_1MONTH><xsl:value-of select="doc:content/m:properties/d:BC_1MONTH"/></BC_1MONTH>
        <BC_3MONTH><xsl:value-of select="doc:content/m:properties/d:BC_3MONTH"/></BC_3MONTH>
    </row>
</xsl:template>

<!-- REMOVE UNNEEDED NODES -->
<xsl:template match="doc:title|doc:id|doc:update|doc:link|doc:updated"/>

</xsl:stylesheet>

PHP Script (loading and processing XML and XSL content)

$doc = new DOMDocument();

// PARSING DIRECTLY FROM WEB PAGE
$doc->load('http://www.treasury.gov/resource-center/data-chart-center/interest-rates/pages/XmlView.aspx?data=yieldyear&year=2015');

$xsl = new DOMDocument;
$xsl->load('XSLTScript.xsl');

// Configure the transformer
$proc = new XSLTProcessor;
$proc->importStyleSheet($xsl); 

// Transform XML source
$newXml = $proc->transformToXML($doc);

// Save output to file
$xmlfile = 'Output.xml';
file_put_contents($xmlfile, $newXml);

XML Output (now XML content can be used in LOAD XML to import into MySQL)

<?xml version="1.0" encoding="UTF-8"?>
<pre>
  <row>
    <NEW_DATE>2015-01-02T00:00:00</NEW_DATE>
    <BC_1MONTH>0.02</BC_1MONTH>
    <BC_3MONTH>0.02</BC_3MONTH>
  </row>
  <row>
    <NEW_DATE>2015-01-05T00:00:00</NEW_DATE>
    <BC_1MONTH>0.02</BC_1MONTH>
    <BC_3MONTH>0.03</BC_3MONTH>
  </row>
  <row>
    <NEW_DATE>2015-01-06T00:00:00</NEW_DATE>
    <BC_1MONTH>0.02</BC_1MONTH>
    <BC_3MONTH>0.03</BC_3MONTH>
  </row>
  ...
</pre>

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM