简体   繁体   中英

Getting content between two div tags using Simple Html Dom

I'm using Simple HTML Dom to parse text between HTML tags.Everything went well until i faced this challenge.I can easily parse text within div tags but how can I parse text between two div tags.

This is the HTML to be parsed:

<div class="album"><b>Album1</b> (1997)</div>
<a href="song11.html" target="_blank">song11</a><br />
<a href="song12.html" target="_blank">song12</a><br />

<div class="album"><b>Album2</b> (1998)</div>
<a href="song21.html" target="_blank">song21</a><br />
<a href="song22.html" target="_blank">song22</a><br />

<div class="album"><b>Album3</b> (1999)</div>
<a href="song31.html" target="_blank">song31</a><br />
<a href="song32.html" target="_blank">song32</a><br />

I want the first album title (Album1), its year (1997) and both the song links with their titles in a single array. Then the second album in a second array and the third album in a third array.

Don't think of it as text between two div nodes, think of it as iterating the div nodes and including some of the a nodes that follow them:

$html =<<<EOF
<div class="album"><b>Album1</b> (1997)</div>
<a href="song11.html" target="_blank">song11</a><br />
<a href="song12.html" target="_blank">song12</a><br />
<div class="album"><b>Album2</b> (1998)</div>
<a href="song21.html" target="_blank">song21</a><br />
<a href="song22.html" target="_blank">song22</a><br />
<div class="album"><b>Album3</b> (1999)</div>
<a href="song31.html" target="_blank">song31</a><br />
<a href="song32.html" target="_blank">song32</a><br />
EOF;

require('simple_html_dom.php');
$doc = str_get_html($html);
$albums = array();

foreach($doc->find('div.album') as $div){
  $album = array();
  $album['title'] = $div->find('b', 0)->innertext;
  $album['song1'] = $div->nextSibling()->innertext;
  $albums[] = $album;
}

var_dump($albums);

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM