I am trying to scrape some content using simple_html_dom without luck.
I am trying to grab the title, image path and the link and display it.
The HTML structure is:
<div class="article_item clearfix">
<h2 class="title"><a href="http://www.demodomain/articleid=1">My amazing Title</a></h2>
<p class="date">September 22 2014</p>
<p class="image_left">
<a href="http://www.demodomain/articleid=1">
<img src="http://www.demodomain/photos/cef78533cd5.jpg" alt="My amazing post ">
</a>
</p>
<p>This is a demo description<strong>of this amazing</strong> article</p>
<p class="more"><a href="http://www.demodomain/articleid=1" class="blued_links">Read more...</a></p>
</div>
My code so far:
foreach($html->find('article_item') as $article) {
$item['title'] = $article->find('.title, a', 0)->plaintext;
$item['thumb'] = $article->find('.image_left img', 0)->src;
$item['details'] = $article->find('p', 0)->plaintext;
$item['url'] = $article->find('.more, a', 0)->plaintext;
echo 'Title: ' . $item['title'];
echo "</br>";
echo "image url: " . $item['thumb'];
echo "</br>";
echo "Description: " . $item['details'];
echo "</br>";
echo "Read More Url: " . $item['url'];
}
// Clear dom object
$html->clear();
unset($html);
You didn't state what's not working, but consider this example:
foreach ($html->find('div.article_item') as $div) {
    // div.article_item: the div tag with class name article_item
    // h2.title a: the anchor inside the h2 tag with class title,
    // the same way you would select it with jQuery
    $title = $div->find('h2.title a', 0)->innertext;
    $thumb = $div->find('p.image_left img', 0)->src;
    // children(3) is the fourth child of the div: the description paragraph
    $details = $div->children(3)->plaintext;
    // $url = $div->find('p.more', 0)->plaintext;
    // the link itself is in the anchor's href attribute, not its text
    $url = $div->find('p.more a', 0)->href;
    echo $title . '<br/>';
    echo $thumb . '<br/>';
    echo $details . '<br/>';
    echo $url . '<br/>';
}
Basically, find() takes CSS-style selectors, so this works much like selecting elements with jQuery.
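To make the selector differences concrete, here is a minimal sketch run against a trimmed copy of the markup from the question. It assumes simple_html_dom.php sits in the same directory as the script; the variable names ($markup, $byTag, $byClass) are only illustrative.

```php
<?php
// Assumption: simple_html_dom.php sits next to this script.
require_once __DIR__ . '/simple_html_dom.php';

// Trimmed copy of the markup from the question.
$markup = <<<HTML
<div class="article_item clearfix">
<h2 class="title"><a href="http://www.demodomain/articleid=1">My amazing Title</a></h2>
<p class="date">September 22 2014</p>
<p class="more"><a href="http://www.demodomain/articleid=1" class="blued_links">Read more...</a></p>
</div>
HTML;

$html = str_get_html($markup);

// No dot: this looks for an <article_item> element, so nothing matches.
$byTag = $html->find('article_item');

// With a dot: this looks for elements whose class includes article_item.
$byClass = $html->find('div.article_item');

$article = $byClass[0];

// A space means "descendant": the <a> inside <h2 class="title">.
$title = $article->find('h2.title a', 0)->plaintext;

// href holds the link; plaintext would return the "Read more..." text.
$url = $article->find('p.more a', 0)->href;

echo count($byTag), ' vs ', count($byClass), "\n";
echo $title, "\n", $url, "\n";

// Free the parsed DOM.
$html->clear();
unset($html);
```

Note also that a comma in a selector, as in '.title, a', means "either of these" rather than "a inside .title", which is probably not what the question's code intended.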
Can you try it like this?
// find() without an index returns an array, so pass 0 to get the first match
$item['title'] = $article->find('h2.title', 0)->plaintext;
$item['thumb'] = $article->find('p.image_left', 0)->find('img', 0)->src;