Encoding special chars, DOMDocument XML and PHP

I am testing with the following characters: " & ' < > £. My code builds an XML file using PHP and DOMDocument.

<?php

 $xml = new DOMDocument();
 $xml->formatOutput = true;
 $root = $xml->createElement('Start_Of_XML');
 $xml->appendChild($root);

 // $node, $value[$i] and $parent come from the surrounding code (not shown)
 $el = $xml->createElement($node, htmlspecialchars(html_entity_decode($value[$i], ENT_QUOTES, 'UTF-8'), ENT_QUOTES, 'UTF-8'));
 $parent->appendChild($el);

?>

The htmlspecialchars() call above converts these characters to:

" &amp; ' &lt; &gt; £

respectively. That is, the double quote, apostrophe and pound sign are not encoded.

If I adjust the code to use htmlentities() instead:

<?php

 $el = $xml->createElement($node, htmlentities(html_entity_decode($value[$i], ENT_QUOTES, 'UTF-8'), ENT_QUOTES, 'UTF-8'));

?>

The characters then come out as:

" &amp; ' &lt; &gt; &pound;

So the pound sign gets converted along with the rest, but again the quote and apostrophe are not encoded when the XML is saved.

After searching through several posts I'm at a loss to find a solution.

Edit:

Using Gordon's answer as a basis, I got the results I was looking for with something along the lines of https://3v4l.org/ZksrE

Great effort from ThW though. Seems pretty comprehensive. I'm going to accept this as a solution. Thanks.

The second argument of DOMDocument::createElement() is broken - it only escapes partly and it is not part of the W3C DOM standard. In DOM the text content is a node. You can just create it and append it to the element node. This works with other node types like CDATA sections or comments as well. DOMNode::appendChild() returns the appended node, so you can nest and chain the calls.

Additionally you can set the DOMElement::$textContent property. This will replace all descendant nodes with a single text node. Do not use DOMElement::$nodeValue - it has the same problems as the argument.

$document = new DOMDocument();
$document->formatOutput = true;
$root = $document->appendChild($document->createElement('foo'));
$root
   ->appendChild($document->createElement('one'))
   ->appendChild($document->createTextNode('"foo" & <bar>'));
$root
   ->appendChild($document->createElement('one'))
   ->textContent = '"foo" & <bar>';
$root
   ->appendChild($document->createElement('two'))
   ->appendChild($document->createCDATASection('"foo" & <bar>'));
$root
   ->appendChild($document->createElement('three'))
   ->appendChild($document->createComment('"foo" & <bar>'));

echo $document->saveXML();

Output:

<?xml version="1.0"?>
<foo>
  <one>"foo" &amp; &lt;bar&gt;</one>
  <one>"foo" &amp; &lt;bar&gt;</one>
  <two><![CDATA["foo" & <bar>]]></two>
  <three>
    <!--"foo" & <bar>-->
  </three>
</foo>
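
Applied to the test characters from the question, the same approach can be wrapped in a small helper. This is only a minimal sketch: the function name appendTextElement() and the element name item are made up for illustration and are not part of the DOM extension. The value is passed in raw, without htmlspecialchars() or htmlentities(), and the escaping is left to the serializer.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of ext/dom): creates an element, appends it
 * to $parent and stores $text in it as a text node.
 */
function appendTextElement(DOMNode $parent, string $name, string $text): DOMElement
{
    // The parent is either the document itself or a node that belongs to one.
    $document = $parent instanceof DOMDocument ? $parent : $parent->ownerDocument;

    $element = $parent->appendChild($document->createElement($name));
    $element->appendChild($document->createTextNode($text));

    return $element;
}

$xml = new DOMDocument('1.0', 'UTF-8');
$xml->formatOutput = true;
$root = $xml->appendChild($xml->createElement('Start_Of_XML'));

// Raw, unescaped input: no htmlspecialchars() / htmlentities() here.
appendTextElement($root, 'item', '" & \' < > £');

$output = $xml->saveXML();
echo $output;
```

With the UTF-8 encoding set on the document, the saved element should read <item>" &amp; ' &lt; &gt; £</item>: only & < > are escaped, because those are the only characters from the test set that need it inside text content.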

Serializing the document escapes special characters (like & and <) as needed. Quotes do not need to be escaped in text content, so they are left as they are. Other special characters depend on the encoding.

$document = new DOMDocument("1.0", "UTF-8");
$document
   ->appendChild($document->createElement('foo'))
   ->appendChild($document->createTextNode('äöü'));
echo $document->saveXML();

$document = new DOMDocument("1.0", "ASCII");
$document
   ->appendChild($document->createElement('foo'))
   ->appendChild($document->createTextNode('äöü'));
echo $document->saveXML();

Output:

<?xml version="1.0" encoding="UTF-8"?> 
<foo>äöü</foo> 
<?xml version="1.0" encoding="ASCII"?> 
<foo>&#228;&#246;&#252;</foo>
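
Whichever form the serializer picks, a parser reads the same characters back, so an unescaped quote in the saved file is not a loss of data. The following sketch checks this with a round trip. The element and attribute names are arbitrary, and the title attribute is an addition for illustration only: it shows a place where a double quote does have to be escaped, which the serializer then takes care of.

```php
<?php
declare(strict_types=1);

$value = '" & \' < > £';

// ASCII output encoding: characters outside ASCII cannot be written literally,
// so the serializer has to fall back to character references for them.
$document = new DOMDocument('1.0', 'ASCII');
$element = $document->appendChild($document->createElement('foo'));
$element->appendChild($document->createTextNode($value));

// Inside an attribute value the double quote is significant, so it is escaped on save.
$element->setAttribute('title', $value);

$saved = $document->saveXML();
echo $saved;

// Parse the saved XML again and compare with the original string.
$reloaded = new DOMDocument();
$reloaded->loadXML($saved);

var_dump($reloaded->documentElement->textContent === $value);            // bool(true)
var_dump($reloaded->documentElement->getAttribute('title') === $value);  // bool(true)
```

Both comparisons hold because escaping is a property of the serialized file, not of the value: the DOM always hands back the original characters.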
