
PHP convert string to htmlentities

How can I convert the code inside the <code> and <pre> tags to HTML entities?

<code class="php"> <div> a div.. </div> </code>

<pre class="php">
<div> a div.. </div>
</pre>

<div> this should be ignored </div>

You can use jQuery. This encodes the contents of every element that has the class "code". For the markup in the question, the selector would be "code.php, pre.php" instead.

$(".code").each(
    function () {
        // Read the element's markup and write it back as plain text, which escapes it.
        $(this).text($(this).html());
    }
);

The fiddle: http://jsfiddle.net/mazzzzz/qnbLL/

PHP

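// \1 makes the closing tag match the opening one (<code>...</code> or <pre>...</pre>).
if (preg_match_all('#<(code|pre) class="php">(.*?)</\1>#is', $html, $code)) {
    // $code[0] holds the full matches and $code[1] the tag names;
    // only $code[2], the text between the tags, needs encoding.
    foreach ($code[2] as $value) {
        // Note: str_replace also touches identical text found outside the matched tags.
        $html = str_replace($value, htmlentities($value, ENT_QUOTES), $html);
    }
}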

HTML

<code class="php"> &lt;div&gt; a div.. &lt;/div&gt; </code>

<pre class="php">
&lt;div&gt; a div.. &lt;/div&gt;
</pre>

<div> this should be ignored </div>
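The str_replace step above replaces the matched text wherever it occurs in $html. As a variation on the same regex idea, here is a minimal sketch that rewrites each match in place with preg_replace_callback, so text outside the tags is never touched. The function name escapeCodeBlocks is hypothetical, and the sketch assumes the tags are written exactly as <code class="php"> or <pre class="php"> with no other attributes.

```php
<?php
// Hypothetical helper: escapes the body of each <code class="php"> or
// <pre class="php"> element and leaves everything else unchanged.
function escapeCodeBlocks(string $html): string
{
    $result = preg_replace_callback(
        '#<(code|pre) class="php">(.*?)</\1>#is',
        function (array $m): string {
            // $m[1] is the tag name, $m[2] the raw contents between the tags.
            return '<' . $m[1] . ' class="php">'
                . htmlentities($m[2], ENT_QUOTES)
                . '</' . $m[1] . '>';
        },
        $html
    );

    // preg_replace_callback returns null on a regex error; fall back to the input.
    return $result ?? $html;
}

$html = '<code class="php"> <div> a div.. </div> </code>' . "\n"
      . '<div> this should be ignored </div>';
echo escapeCodeBlocks($html), "\n";
```

Because each replacement is built from its own match, identical text elsewhere in the page is left alone.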

Have you ever heard of BBCode? http://en.wikipedia.org/wiki/BBCode

OK, I've been playing with this for a while. The result may not be the best or most direct solution (and, frankly, I disagree with your approach entirely if arbitrary users are going to be submitting the input), but it appears to "work". And, most importantly, it doesn't use regexes for parsing XML. :)

Faking the input

<?php

$str = <<<EOF
<code class="php"> <div> a div.. </div> </code>

<pre class="php">
<div> a div.. </div>
</pre>

<div> this should be ignored </div>
EOF;

?>

Code

<?php

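// Objects are passed by handle in PHP, so no reference (&) parameters are needed.
function recurse(DOMDocument $doc, DOMNode $parent) {
   if (!$parent->hasChildNodes())
      return;

   foreach ($parent->childNodes as $elm) {

      if ($elm->nodeName === "code" || $elm->nodeName === "pre") {
         // Serialize the children to a string, then put that string back as a
         // single text node; the DOM escapes it when the document is output.
         $content = '';
         // Always take item(0): childNodes is a live list that shrinks on each
         // removeChild(), so an indexed `for` loop would skip nodes.
         while ($elm->hasChildNodes()) {
             $child = $elm->childNodes->item(0);
             $content .= $doc->saveXML($child);
             $elm->removeChild($child);
         }
         $elm->appendChild($doc->createTextNode($content));
      }
      else {
         recurse($doc, $elm);
      }
   }
}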

// Load in the DOM (remembering that XML requires one root node)
$doc = new DOMDocument();
$doc->loadXML("<document>" . $str . "</document>");

// Iterate the DOM, finding <code /> and <pre /> tags:
recurse($doc, $doc->documentElement);

// Output the result
foreach ($doc->childNodes->item(0)->childNodes as $node) {
   echo $doc->saveXML($node);
}

?>

Output

<code class="php"> &lt;div&gt; a div.. &lt;/div&gt; </code>

<pre class="php">
&lt;div&gt; a div.. &lt;/div&gt;
</pre>

<div> this should be ignored </div>
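One caveat with the DOM approach: loadXML only accepts well-formed XML, so user-submitted markup with an unclosed tag will fail to parse. A minimal sketch of guarding against that follows, assuming the same "<document>" wrapper as above; the helper name loadFragment is hypothetical.

```php
<?php
// Hypothetical helper: returns the parsed document, or null when the wrapped
// fragment is not well-formed XML.
function loadFragment(string $fragment): ?DOMDocument
{
    // Collect parse errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $ok = $doc->loadXML('<document>' . $fragment . '</document>');

    // Discard the collected errors and restore the previous error mode.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $ok ? $doc : null;
}

// A well-formed fragment parses.
var_dump(loadFragment('<code class="php"> <div> a div.. </div> </code>') !== null);
// An unclosed <br> is valid HTML but not well-formed XML, so parsing fails.
var_dump(loadFragment('<pre class="php"> <br> </pre>') !== null);
```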

Proof

You can see it working here.

Note that it doesn't explicitly call htmlspecialchars; the DOMDocument object handles the escaping itself.
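To illustrate that point in isolation, here is a small sketch that serializes a string as a DOM text node and prints it next to the htmlspecialchars result. The helper name escapeViaDom is hypothetical. For this sample the two agree, but they are not guaranteed to agree on every character (quote handling, for instance, may differ).

```php
<?php
// Hypothetical helper: wrap a string in a DOM text node and return its
// serialized form, which is where the escaping happens.
function escapeViaDom(string $text): string
{
    $doc = new DOMDocument();
    return $doc->saveXML($doc->createTextNode($text));
}

$raw = '<div> a div.. </div>';
echo escapeViaDom($raw), "\n";     // escaped by the DOM serializer
echo htmlspecialchars($raw), "\n"; // escaped explicitly, for comparison
```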

I hope that this helps. :)

This is somewhat related: I wrote a bit of code in "Advice for implementing simple regex (for bbcode/geshi parsing)" that should help you with this problem. You do not have to use GeSHi.

It can be tweaked to not use GeSHi; it would just take a bit of tinkering. Hope it helps.
