How to extract innerHTML using the PHP DOM

Question

I'm currently using nodeValue to get an element's content, but it strips the HTML markup and returns only plain text. How can I modify my code to get the inner HTML of an element, looked up by its ID?

    function getContent($url, $id) {
        // This first section fetches the HTML from a URL
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_BINARYTRANSFER, true);
        $html = curl_exec($ch);
        curl_close($ch);

        // This second section parses the HTML and outputs it
        $newDom = new DOMDocument;
        // Parser options must be set before loading to have any effect
        $newDom->preserveWhiteSpace = false;
        $newDom->validateOnParse = true;
        $newDom->loadHTML($html);

        // nodeValue returns only the text content, which is the problem
        $sections = $newDom->getElementById($id)->nodeValue;
        echo $sections;
    }

Answer 1

This works for me:

$sections = $newDom->saveXML($newDom->getElementById($id));

http://www.php.net/manual/en/domdocument.savexml.php

If you have PHP 5.3.6 or later, saveHTML() also accepts a node and serializes it as HTML rather than XML:

$sections = $newDom->saveHTML($newDom->getElementById($id));

http://www.php.net/manual/en/domdocument.savehtml.php
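
Both calls above return the element together with its own opening and closing tags. If only the markup inside the element is wanted, one possible approach is to serialize each child node and concatenate the results. The following is a minimal sketch, not part of the original answer: the helper name innerHtml and the sample markup are made up for illustration, and the type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helper: returns the markup of an element's children only,
// leaving out the element's own opening and closing tags.
function innerHtml(DOMNode $element): string
{
    $html = '';
    foreach ($element->childNodes as $child) {
        // saveHTML() with a node argument (PHP 5.3.6+) serializes just that node.
        $html .= $element->ownerDocument->saveHTML($child);
    }
    return $html;
}

// Sample markup, assumed for illustration.
$doc = new DOMDocument();
$doc->loadHTML('<div id="box"><p>Hello <b>world</b></p></div>');
$box = $doc->getElementById('box');

echo innerHtml($box);
```

On PHP versions older than 5.3.6, saveXML($child) could be substituted inside the loop, at the cost of XML-style output.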

Answer 2

I have modified the code, and it works fine for me. The code is below:

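    // Fetch the page (same cURL setup as in the question)
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_BINARYTRANSFER, true);
    $html = curl_exec($ch);
    curl_close($ch);

    $newDom = new DOMDocument;
    // Parser options must be set before loading to have any effect
    $newDom->preserveWhiteSpace = false;
    $newDom->validateOnParse = true;

    // Keep libxml from emitting warnings for malformed HTML while parsing
    libxml_use_internal_errors(true);
    $newDom->loadHTML($html);
    libxml_use_internal_errors(false);

    // saveHTML() with a node argument returns that element's markup, tags included
    $sections = $newDom->saveHTML($newDom->getElementById('colophon'));
    echo $sections;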
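
One thing to watch with this approach: getElementById() returns null when no element has the given ID, and saveHTML(null) serializes the whole document instead of failing. A guarded variant might look like the sketch below. The function name elementHtmlById and the sample page are assumptions for illustration, and the nullable return type assumes PHP 7.1 or later.

```php
<?php
// Hypothetical helper: returns the outer HTML of the element with the given
// ID, or null when the document contains no such element.
function elementHtmlById(string $html, string $id): ?string
{
    // Collect parse warnings internally and remember the previous setting.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML($html);

    // Discard the collected warnings and restore the caller's setting.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $element = $doc->getElementById($id);
    if ($element === null) {
        return null; // avoid saveHTML(null), which would dump the whole document
    }
    return $doc->saveHTML($element);
}

// Sample page, assumed for illustration.
$page = '<html><body><div id="colophon"><p>Footer</p></div></body></html>';

var_dump(elementHtmlById($page, 'colophon'));
var_dump(elementHtmlById($page, 'missing'));
```

Restoring the previous libxml_use_internal_errors() value, rather than hard-coding false, keeps the helper from changing error handling for the rest of the script.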

solution1
2 2012-03-09 14:03:03

solution2
0 2014-04-20 04:22:22
