
DOMDocument stripping tags from inline scripts PHP

This is a strange one, but it looks like $doc->saveHTML() is stripping tags from inline JavaScript:

$domStr = '
<!DOCTYPE html>
   <html>
    <head>
        <meta charset="utf-8"/>
        <title>my page</title>
        <script>
            var elem = "<div>some content</div>";
        </script>
    </head>
    <body>
        <div>
            MY PAGE
        </div>
    </body>
</html>
';
$doc = new DOMDocument();
libxml_use_internal_errors(true); // collect libxml parse warnings internally instead of emitting them; see the php.net manual
$doc->formatOutput = true;
$doc->strictErrorChecking = false;
$doc->preserveWhiteSpace = true;

$doc->loadHTML($domStr);
echo $doc->saveHTML();
exit;

http://sandbox.onlinephpfunctions.com/code/ad59a2a1016b2128e437ef61dbe00f1c511bff8d

If you use libxml_use_internal_errors(true); you will not see what is wrong, but if you remove it you get:

<b>Warning</b>:  DOMDocument::loadHTML(): Unexpected end tag : div

The same thing happens with:

$doc->formatOutput = false;
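
To see what the parser complained about without turning the warnings back on, the collected errors can be read back with libxml_get_errors(). This is a minimal sketch, assuming the same kind of markup as above. The shortened $html string is only illustrative, and which messages appear (if any) depends on the libxml2 version PHP was built against:

```php
<?php
// Illustrative, shortened version of the page from the question.
$html = '<!DOCTYPE html><html><head><title>my page</title>'
      . '<script>var elem = "<div>some content</div>";</script>'
      . '</head><body><div>MY PAGE</div></body></html>';

// Collect libxml errors internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);

// Each entry is a LibXMLError object with level, code, line and message.
$errors = libxml_get_errors();
foreach ($errors as $error) {
    printf("line %d: %s\n", $error->line, trim($error->message));
}

// Empty the internal buffer so later parses start clean.
libxml_clear_errors();
```
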

Any help is appreciated.

I've avoided this by not including any HTML in my inline JavaScript. Instead, I've added <template> elements containing the HTML string I want to manipulate in JS, and then I read that dynamically at runtime. For example:

<!DOCTYPE html>
<html>
    <head>
        <meta charset="utf-8"/>
        <title>my page</title>
    </head>

    <body>
        <div>
            MY PAGE
        </div>

        <template id="content-template">
            <div>some content</div>
        </template>

        <script>
            var elem = document.getElementById('content-template').innerHTML;
            ...
        </script>
    </body>
</html>

This is probably a bug in DOMDocument.

You have to escape the closing HTML tag inside the JS string, or the parser misinterprets it as a real end tag.

This should work: var elem = "<div>some content<\/div>";
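
If the HTML fragment is generated on the PHP side, the escaping does not have to be done by hand: json_encode() escapes forward slashes by default, so the literal it produces already contains <\/div>. This is a minimal sketch; the helper name jsStringLiteral is hypothetical and not part of the original post:

```php
<?php
// Hypothetical helper: turn a PHP string into a JavaScript string literal.
// json_encode() escapes "/" as "\/" unless JSON_UNESCAPED_SLASHES is passed,
// so no "</" sequence is left for the HTML parser to misread.
function jsStringLiteral(string $value): string
{
    return json_encode($value, JSON_THROW_ON_ERROR);
}

$script = 'var elem = ' . jsStringLiteral('<div>some content</div>') . ';';
echo $script, PHP_EOL;
// prints: var elem = "<div>some content<\/div>";
```

In JavaScript "\/" inside a string literal is just "/", so the value of elem at runtime is unchanged.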

Alternatively, if you pass option 1 to loadHTML(), the parser will ignore it.

Oddly, 1 can mean both LIBXML_SCHEMA_CREATE and LIBXML_ERR_WARNING, as these two predefined constants have the same value. Presumably it is meant to be LIBXML_SCHEMA_CREATE, which does the following: "Create default/fixed value nodes during XSD schema validation".
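
The overlap between the two constants can be checked directly, and the value can be passed as the second ($options) argument of loadHTML(). This is a rough sketch, assuming the shortened markup below stands in for the page from the question. Whether the script content survives intact with this option is the claim made above and may depend on the libxml2 version, so no output is shown for saveHTML():

```php
<?php
// Both constants come from PHP's libxml extension and share the value 1.
var_dump(LIBXML_SCHEMA_CREATE === LIBXML_ERR_WARNING); // bool(true)
var_dump(LIBXML_SCHEMA_CREATE);                        // int(1)

// Illustrative, shortened version of the page from the question.
$html = '<!DOCTYPE html><html><head><title>my page</title>'
      . '<script>var elem = "<div>some content</div>";</script>'
      . '</head><body><div>MY PAGE</div></body></html>';

libxml_use_internal_errors(true);

$doc = new DOMDocument();
// Pass the raw value 1 as the options argument, as suggested above.
$doc->loadHTML($html, 1);
echo $doc->saveHTML();
```
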

After the DOCTYPE declaration, you are missing the opening <html> tag.
