
PHP and XMLWriter

I was successfully using the following code to merge multiple large XML files into a new (larger) XML file. I found at least part of it on StackOverflow.

    $docList = new DOMDocument();
    $root = $docList->createElement('documents');
    $docList->appendChild($root);

    $doc = new DOMDocument();
    foreach ($xmlfilenames as $xmlfilename) {
        $doc->load($xmlfilename);
        $xmlString = $doc->saveXML($doc->documentElement);
        $xpath = new DOMXPath($doc);
        $query = self::getQuery(); // this is the name of the ROOT element
        $nodelist = $xpath->evaluate($query, $doc->documentElement);
        if ($nodelist->length > 0) {
            $node = $docList->importNode($nodelist->item(0), true);
            $xmldownload = $docList->createElement('document');
            if (self::getShowFileName()) {
                $xmldownload->setAttribute("filename", $xmlfilename);
            }
            $xmldownload->appendChild($node);
            $root->appendChild($xmldownload);
        }
    }

    $newXMLFile = self::getNewXMLFile();
    $docList->save($newXMLFile);

I started running into out-of-memory errors as both the number of files and their size grew.

I found an article here which explained the issue and recommended using XMLWriter.

So now I am trying to use PHP's XMLWriter to merge multiple large XML files into a new (larger) XML file. Later, I will run XPath queries against the new file.

Code:

    $xmlWriter = new XMLWriter();
    $xmlWriter->openMemory();
    $xmlWriter->openUri('mynewFile.xml');
    $xmlWriter->setIndent(true);
    $xmlWriter->startDocument('1.0', 'UTF-8');
    $xmlWriter->startElement('documents');

    $doc = new DOMDocument();
    foreach ($xmlfilenames as $xmlfilename) {
        $fileContents = file_get_contents($xmlfilename);
        $xmlWriter->writeElement('document', $fileContents);
    }

    $xmlWriter->endElement();
    $xmlWriter->endDocument();
    $xmlWriter->flush();

Well, the resulting (new) XML file is no longer correct, since the elements are escaped, i.e.:

    <?xml version="1.0" encoding="UTF-8"?>

    &lt;CONFIRMOWNX&gt;
    &lt;Confirm&gt;
    &lt;LglVeh id=&quot;GLE&quot;&gt;
    &lt;AddrLine1&gt;GLEACHER &amp;amp; COMPANY&lt;/AddrLine1&gt;
    &lt;AddrLine2&gt;DESCAP DIVISION&lt;/AddrLine2&gt;

Can anyone explain how to take the contents of each XML file and write them properly to the new file?

I'm burnt on this and I KNOW it'll be something simple I'm missing.

Thanks. Robert

See, the problem is that XMLWriter::writeElement is intended to, well, write a complete XML element, treating its second parameter as plain text content. That's why it automatically escapes whatever is passed as the second param (replacing & with &amp; and < with &lt;, for example).
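To make the escaping visible in isolation, here is a minimal sketch that writes a markup string through writeElement into an in-memory buffer. The sample string and the variable names are made up for illustration; only the XMLWriter calls come from the question.

```php
<?php
// writeElement() treats its second argument as text, so markup
// characters in it are escaped rather than written as tags.
$writer = new XMLWriter();
$writer->openMemory();
$writer->writeElement('document', '<Confirm>A &amp; B</Confirm>');

// outputMemory() returns (and clears) the buffered XML.
$escaped = $writer->outputMemory();
echo $escaped, PHP_EOL;
```

The already-escaped &amp; in the input gets escaped a second time, which is where the &amp;amp; in the question's output comes from.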

One possible solution is to use the XMLWriter::writeRaw method instead, as it writes the contents as is, without any escaping. Obviously it doesn't validate its input, but in your case that does not seem to be a problem (as you're working with already-checked source files).
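A rough sketch of the writeRaw approach follows. The function name mergeXmlFiles and its parameters are hypothetical, and it assumes every source file is well-formed UTF-8 XML without a BOM or DOCTYPE. One thing to watch for: each source file may start with its own XML declaration, which is only legal at the very beginning of a document, so the sketch strips it before writing.

```php
<?php
// Sketch only: mergeXmlFiles() is an illustrative helper, not part of XMLWriter.
function mergeXmlFiles(array $xmlFilenames, string $targetFile): void
{
    $writer = new XMLWriter();
    if (!$writer->openUri($targetFile)) {
        throw new RuntimeException("Cannot open $targetFile for writing");
    }
    $writer->startDocument('1.0', 'UTF-8');
    $writer->startElement('documents');

    foreach ($xmlFilenames as $xmlFilename) {
        $contents = file_get_contents($xmlFilename);
        if ($contents === false) {
            throw new RuntimeException("Cannot read $xmlFilename");
        }
        // Drop a leading <?xml ...?> declaration, if the file has one.
        $contents = preg_replace('/^\s*<\?xml\s.*?\?>\s*/s', '', $contents);

        $writer->startElement('document');
        $writer->writeAttribute('filename', basename($xmlFilename));
        $writer->writeRaw($contents); // written as is, no escaping
        $writer->endElement();
        $writer->flush();             // push buffered output to the file
    }

    $writer->endElement();
    $writer->endDocument();
    $writer->flush();
}
```

Called as mergeXmlFiles($xmlfilenames, 'mynewFile.xml'), this holds only one source file in memory at a time, and flushing after each file keeps the writer's buffer small.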

Hmm, not sure why it's converting it to HTML entities, but you can decode it like so:

htmlspecialchars_decode($data);

It converts special HTML entities back to characters.
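As a small illustration, here is htmlspecialchars_decode applied to one of the escaped lines from the question (the variable names are made up). Note that this is a sketch of the decoding step only: applying it to the merged file would mean reading that file back as a string, and an XML declaration copied from a source file would still sit in the middle of the result.

```php
<?php
// One escaped line taken from the output shown in the question.
$escapedLine = '&lt;AddrLine1&gt;GLEACHER &amp;amp; COMPANY&lt;/AddrLine1&gt;';

// Decoding is a single pass: &amp;amp; becomes &amp;, not a bare &.
$decoded = htmlspecialchars_decode($escapedLine);
echo $decoded, PHP_EOL;
```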
