
PHP - Remove special characters in XML

How can I remove (escape) the special characters between the opening and closing XML tags?

I have tried using a recursive function, but it doesn't work for me in this case.

$sampleXML = '<?xml version="1.0" encoding="ISO-8859-1"?>  
<mainTag type="user">
<note>
    <PersonName>
        <GivenName>Replace & this</GivenName>
        <MiddleName>Replace < this</MiddleName>
    </PersonName>
    <Aliases>
        <PersonName>
            <GivenName>Replace > this</GivenName>
            <FamilyName>Replace " this</FamilyName>
        </PersonName>
    </Aliases>
    <DemographicDetail>
        <GovernmentId countryCode="US">testIDs data  </GovernmentId>
        <DateOfBirth>2000-12-12</DateOfBirth>
    </DemographicDetail>
</note>
<anothertag>
    <data type="credit">
        <Vendor score="yes"> vendor name  </Vendor>
    </data>
</anothertag>
</mainTag>';


$doc = new DOMDocument;
$doc->loadXML($sampleXML);
$this->removeSpecialCharacterNodes($doc);
$xpath = new DOMXpath($doc);
$xml = $doc->saveXML($doc, LIBXML_NOEMPTYTAG);

Replace the following characters:

 & by &amp;
 < by &lt;
 > by &gt;
 " by &quot;
 ' by &apos;

I used the recursive code below, but it returns an empty value.

public function removeSpecialCharacterNodes(DOMNode $node) {
        // echo "aa";
        // var_dump($node->childNodes);
        $str = $node->childNodes;
        var_dump($node->childNodes);
        foreach ($node->childNodes as $child){
          if($child->hasChildNodes()) {
            $this->removeSpecialCharacterNodes($child);
          } else{
                $child->nodeValue = str_ireplace('&', '&amp;', $child->nodeValue);
          }
        }    
    }

Update: I have also tried string replacement and htmlspecialchars(), but the special characters are still not replaced.

$doc = new DOMDocument;
$doc->loadXML( $sampleXML);

foreach ($doc->documentElement->childNodes as $node) {
    if($node->nodeType==1){
        $oldAddressLine = $node->getElementsByTagName('AddressLine')->Item(0);
        // $elle = str_ireplace(
        //  array( "'"),
        //  array( "&apos;"), 
        //  $oldAddressLine->nodeValue
        // );
        // $newelement = $doc->createElement('AddressLine', $elle); 
                
        $chk = $oldAddressLine->nodeValue;
        $newelement = $doc->createElement('AddressLine', htmlspecialchars( $chk, ENT_XML1 )); 

        if ($oldAddressLine->parentNode != null) {
           $oldAddressLine->parentNode->replaceChild($newelement, $oldAddressLine);
        }
    }
 }

 $xpath = new DOMXpath($doc);

 $finalVal = $doc->saveXML($doc, LIBXML_NOEMPTYTAG);

 echo "<pre>".htmlentities($finalVal)."</pre>"; exit;

The so-called special characters have to be written as entities in XML. To do that, encode them with htmlspecialchars().

$value = htmlspecialchars( "Ben & Jerry 's", ENT_XML1 );
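Applied to the string from the question, the escaping has to happen before the text reaches loadXML(), because a raw & or < makes the document malformed. The following is only a minimal sketch under several assumptions: escapeLeafText() is a hypothetical helper name, every text-only element sits on its own line (as in the sample), lines end in LF, the text is valid UTF-8 and none of it is already escaped. It is a line-based regular expression, not a real XML parser.

```php
<?php
// Hypothetical helper: escapes the text of elements that open and close
// on the same line. Lines holding only an opening or a closing tag are
// left untouched because the pattern requires both on one line.
function escapeLeafText(string $xml): string
{
    // 1: indentation + opening tag, 2: tag name, 3: raw text, 4: closing tag
    $pattern = '~^([ \t]*<([A-Za-z_][\w.-]*)(?:\s[^<>]*)?>)(.*)(</\2>[ \t]*)$~m';

    $result = preg_replace_callback(
        $pattern,
        static function (array $m): string {
            // ENT_QUOTES also converts " and ', matching the question's list.
            return $m[1]
                . htmlspecialchars($m[3], ENT_XML1 | ENT_QUOTES, 'UTF-8')
                . $m[4];
        },
        $xml
    );

    // preg_replace_callback() returns null on a regex error; fall back to the input.
    return $result ?? $xml;
}

$broken = "<r>\n  <a>Tom & Jerry</a>\n  <b k=\"v\">1 < 2 > 0</b>\n</r>";
$fixed  = escapeLeafText($broken);

echo $fixed, "\n";
```

The returned string can then be handed to DOMDocument::loadXML(). If the input may already contain entities such as &amp;, this sketch would escape them a second time, so it only suits input known to be entirely unescaped.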

Since PHP 5.4 you can use:

htmlspecialchars($string, ENT_XML1);

You should specify the encoding, such as:

htmlspecialchars($string, ENT_XML1, 'UTF-8');

Update: Note that the above will only convert:

& to &amp;
< to &lt;
> to &gt;

If you want to escape text for use in an attribute enclosed in double quotes:

htmlspecialchars($string, ENT_XML1 | ENT_COMPAT, 'UTF-8');

will convert " to &quot; in addition to &, < and >.

And if your attributes are enclosed in single quotes:

htmlspecialchars($string, ENT_XML1 | ENT_QUOTES, 'UTF-8');

will convert ' to &apos; in addition to &, <, > and ".

(Of course you can use this even outside of attributes).
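To make the difference between the three flag combinations concrete, here is a small sketch that runs the same input through each of them. The variable names and the sample string are made up for illustration.

```php
<?php
// Sample input containing all five characters that XML treats specially.
$raw = 'a & b < c > "d" \'e\'';

// ENT_XML1 alone: only &, < and > are converted.
$textOnly = htmlspecialchars($raw, ENT_XML1, 'UTF-8');

// ENT_COMPAT adds double quotes.
$doubleQuoted = htmlspecialchars($raw, ENT_XML1 | ENT_COMPAT, 'UTF-8');

// ENT_QUOTES adds both double and single quotes.
$anyQuotes = htmlspecialchars($raw, ENT_XML1 | ENT_QUOTES, 'UTF-8');

echo $textOnly, "\n";      // a &amp; b &lt; c &gt; "d" 'e'
echo $doubleQuoted, "\n";  // a &amp; b &lt; c &gt; &quot;d&quot; 'e'
echo $anyQuotes, "\n";     // a &amp; b &lt; c &gt; &quot;d&quot; &apos;e&apos;
```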

See the manual entry for htmlspecialchars.


 