
Wrap any HTML tags when using //text() in PHP: DOMXPath

I have the following HTML:

<div id="ABC">
    <i>Lorem Ipsum</i> is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book.
    <br>
    It has survived not only <b>five centuries</b>, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with <i>desktop publishing software</i> like Aldus PageMaker including versions of Lorem Ipsum.
</div>

I'm using the following query to store ABC content in an array:

foreach ( $xpath->query('//div[@id="ABC"]/text() | //div[@id="ABC"]/i | //div[@id="ABC"]/b') as $text ) {
     $data['content'][] = $text->nodeValue; 
}

And the output is something like this:

   [content] => Array
        (
            [0] => Lorem Ipsum
            [1] => is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book.
            [2] => It has survived not only
            [3] => five centuries
            [4] => , but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with
            [5] => desktop publishing software
            [6] => like Aldus PageMaker including versions of Lorem Ipsum.
   )

Is it possible to get the output like this instead?

   [content] => Array
        (
            [0] => Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book.
            [1] => It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.
   )

What you can do is accumulate the text nodes in a string until you encounter a br node. At that point, add the accumulated string to your $data['content'] array and reset it to an empty string. After the loop ends, you'll also need to add whatever is left in the string to the array if it isn't empty, because the last line has no br after it.

So the loop should look something like this:

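$line = '';
// The union is returned in document order, so each br marks the end of a line.
foreach ( $xpath->query('//div[@id="ABC"]/text() | //div[@id="ABC"]/i | //div[@id="ABC"]/b | //div[@id="ABC"]/br') as $node ) {
    if ($node->nodeName === 'br') {
        // Flush the accumulated text; trim() drops the newlines and indentation around it.
        $data['content'][] = trim($line);
        $line = '';
    } else {
        $line .= $node->nodeValue;
    }
}
// Add whatever follows the last br (compare against '' so that a line like "0" is kept).
if (trim($line) !== '') {
    $data['content'][] = trim($line);
}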

Note that I've added //div[@id="ABC"]/br to the union in your $xpath->query call so that the br nodes are returned in the loop as well.
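To try the approach end to end, here is a minimal self-contained sketch. The sample markup is a shortened, made-up version of the HTML in the question, and the trim() calls are my own addition to strip the indentation and newlines around each line; everything else follows the loop described above.

```php
<?php
// Hypothetical sample markup, shortened from the HTML in the question.
$html = '<div id="ABC">
    <i>Lorem Ipsum</i> is simply dummy text.
    <br>
    It has survived not only <b>five centuries</b>, and more recently <i>desktop publishing software</i> too.
</div>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Text nodes, i, b and br children of the div, returned in document order.
$query = '//div[@id="ABC"]/text() | //div[@id="ABC"]/i | //div[@id="ABC"]/b | //div[@id="ABC"]/br';

$data = ['content' => []];
$line = '';
foreach ($xpath->query($query) as $node) {
    if ($node->nodeName === 'br') {
        // A br ends the current line: store it and start a new one.
        $data['content'][] = trim($line);
        $line = '';
    } else {
        // Text node or inline element: append its text to the current line.
        $line .= $node->nodeValue;
    }
}
// The text after the last br has no terminator, so flush it here.
if (trim($line) !== '') {
    $data['content'][] = trim($line);
}

print_r($data);
```

With this sample input, $data['content'] should end up with two entries, one for the text on each side of the br. If the div can contain inline tags other than i and b, each of them would need to be added to the union as well.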
