
PHP: save the inner HTML of a <p> tag (only one <p> tag on the page)

I am trying to get the inner HTML of a <p> tag and save it as a .txt file. It is a very simple page; there is only one <p> on it. I tried using getElementsByTagName('p') as per "Using PHP to get DOM Element". Unfortunately, it didn't work for me, but maybe I'm missing something. My code is:

<?php
$dataPage = file_get_contents('http://www.somedataurl.com');
$doc = new DOMDocument;
$doc->loadHTML($dataPage);

$dataNodeList = $doc->getElementsByTagName('p');
$dataNode = $dataNodeList->item(0);

function innerHTML($node) {
    return implode(array_map([$node->ownerDocument, "saveHTML"],
            iterator_to_array($node->childNodes)));
}

$theData = innerHTML($dataNode);

header('Content-Type: text/plain');
$filename = date('Y-m-d') . '.txt';
file_put_contents($filename, $theData);

The error log is giving me:

PHP Notice: Undefined property:: DOMNodeList (line 10)

PHP Notice: Undefined property:: DOMNodeList (line 11)

PHP Catchable fatal error (line 11)

These errors sound rather alarming, especially the last one.

Question: Is there a better tool than getElementsByTagName() that I can use, since I am only dealing with one <p>? Or can this approach work if I adjust a few things?
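
For what it's worth, the notices in the log are what PHP reports when a DOMNodeList, rather than a single node, reaches innerHTML(): a DOMNodeList has no ownerDocument or childNodes property, so iterator_to_array() then receives null. Below is a minimal sketch of the DOM approach with a guard for the "no <p> found" case. The helper name innerHtmlOf and the inline sample markup are assumptions made for illustration; the real page would come from file_get_contents() as in the code above.

```php
<?php
// Hypothetical helper: serialize the children of ONE DOMNode (not a DOMNodeList).
function innerHtmlOf(DOMNode $node)
{
    $html = '';
    foreach ($node->childNodes as $child) {
        // saveHTML($child) serializes only that child node.
        $html .= $node->ownerDocument->saveHTML($child);
    }
    return $html;
}

// Stand-in markup (an assumption) so the sketch runs without a network request.
$dataPage = '<html><body><p>Hello <b>world</b></p></body></html>';

libxml_use_internal_errors(true); // collect parser warnings instead of logging them
$doc = new DOMDocument();
$doc->loadHTML($dataPage);
libxml_clear_errors();

// getElementsByTagName() returns a DOMNodeList; item(0) is the first
// matching element, or null when the page contains no <p>.
$dataNode = $doc->getElementsByTagName('p')->item(0);

$theData = ($dataNode === null) ? '' : innerHtmlOf($dataNode);
echo $theData, PHP_EOL;
```

If $dataNode is null, the sketch falls back to an empty string instead of passing null on to the helper.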

If there is only one <p> tag, I think you would be better off extracting its content with a regular expression.

Example:

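// The s flag lets . match newlines; the i flag ignores case (<P> vs <p>).
if (preg_match('/<p>(.*?)<\/p>/is', $dataPage, $match)) {
    // $match[1] holds everything between <p> and </p>.
    print_r($match[1]);
}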
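
Note that the pattern above only matches a bare <p>, so a tag such as <p class="data"> would be missed. Here is a slightly more tolerant sketch that also writes the result to a dated .txt file, as the question asks. The helper name firstParagraph, the sample markup, and the use of the system temp directory are assumptions made for illustration.

```php
<?php
// Hypothetical helper: return the inner HTML of the first <p>, or null if none.
function firstParagraph($html)
{
    // \b[^>]* tolerates attributes such as <p class="x"> without matching <pre>;
    // i = case-insensitive, s = let . span newlines.
    if (preg_match('/<p\b[^>]*>(.*?)<\/p>/is', $html, $match) === 1) {
        return $match[1];
    }
    return null;
}

// Stand-in markup (an assumption) in place of the fetched page.
$dataPage = "<html><body><p class=\"data\">line one\nline two</p></body></html>";
$theData = firstParagraph($dataPage);

if ($theData !== null) {
    // Written to the temp directory here only to keep the sketch side-effect free.
    $filename = sys_get_temp_dir() . '/' . date('Y-m-d') . '.txt';
    file_put_contents($filename, $theData);
}
```

A regular expression is fine for a page this simple, but it will cut the content short if the page ever nests one <p> inside another element that also contains a </p>.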
