DOM Parser grabbing href of <a> tag by class

I'm working with a DOM parser and I'm having issues. I'm trying to grab the href of only those <a> tags that have the class 'thumbnail'. I've been trying to print the links to the screen but get no results. I also turned on error_reporting(E_ALL); and still see nothing. Any help is appreciated.

$html = file_get_contents('http://www.reddit.com/r/funny');
$dom = new DOMDocument();
@$dom->loadHTML($html);
$classId = "thumbnail ";
$div = $html->find('a#'.$classId);
echo $div;

I also tried this but still had the same result of NOTHING:

include('simple_html_dom.php');
$html = file_get_contents('http://www.reddit.com/r/funny');
$dom = new DOMDocument();
@$dom->loadHTML($html);
// grab all the links on the page
$xpath = new DOMXPath($dom);
$hrefs = $xpath->evaluate("/html/body//a");
$ret = $html->find('a[class=thumbnail]');
echo $ret;

You were almost there:

<?php
$dom = new DOMDocument();
@$dom->loadHTMLFile('http://www.reddit.com/r/funny');

$xpath = new DOMXPath($dom);
$hrefs = $xpath->evaluate("/html/body//a[contains(concat(' ',normalize-space(@class),' '),' thumbnail ')]");
var_dump($hrefs);

Gives:

class DOMNodeList#28 (1) {
  public $length =>
  int(25)
}

25 matches, I'd call it success.
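To illustrate why the expression above wraps the class attribute in spaces before testing it, here is a minimal sketch run against inline markup. The HTML string and its hrefs are made up for the example (they are not reddit's actual markup); the point is only that a plain @class="thumbnail" comparison misses elements carrying more than one class, while the concat/normalize-space form matches "thumbnail" as a whole token.

```php
<?php
// Hypothetical inline markup standing in for the fetched page (an assumption).
$html = '<html><body>'
      . '<a class="thumbnail" href="/a">A</a>'
      . '<a class="thumbnail may-blank" href="/b">B</a>'
      . '<a class="thumbnails" href="/c">C</a>'
      . '<a class="title" href="/d">D</a>'
      . '</body></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Exact string comparison: only matches class="thumbnail" and nothing else.
$exact = $xpath->query('//a[@class="thumbnail"]');

// Token match: "thumbnail" may appear alongside other class names.
$token = $xpath->query(
    "//a[contains(concat(' ', normalize-space(@class), ' '), ' thumbnail ')]"
);

$hrefs = [];
foreach ($token as $a) {
    $hrefs[] = $a->getAttribute('href');
}

echo $exact->length, "\n";        // 1
echo implode(', ', $hrefs), "\n"; // /a, /b
```

Note that "thumbnails" is not matched by either query, which is the behaviour you usually want when selecting by class.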

This code would probably work:

$html = file_get_contents('http://www.reddit.com/r/funny');
$dom = new DOMDocument();
@$dom->loadHTML($html);

$xpath = new DOMXPath($dom);
$hyperlinks = $xpath->query('//a[@class="thumbnail"]');

foreach($hyperlinks as $hyperlink) {
   echo $hyperlink->getAttribute('href'), '<br>';
}

If you're using simple_html_dom, why are you doing all these superfluous things? It already wraps everything you need: http://simplehtmldom.sourceforge.net/manual.htm

include('simple_html_dom.php');

// set up:
$html = new simple_html_dom();

// load from URL:
$html->load_file('http://www.reddit.com/r/funny');

// find those <a> elements:
$links = $html->find('a[class=thumbnail]');

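// find() returns an array of elements, so print each href in turn:
foreach ($links as $link) {
    echo $link->href, "\n";
}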

I tested it and made some changes; this works perfectly too.

<?php
    // load the url and set up an array for the links
    $dom = new DOMDocument();
    @$dom->loadHTMLFile('http://www.reddit.com/r/funny');
    $links = array();

    // loop thru all the A elements found
    foreach($dom->getElementsByTagName('a') as $link) {
        $url = $link->getAttribute('href');
        $class = $link->getAttribute('class');

        // Check if the URL is not empty and if the class contains thumbnail
        if(!empty($url) && strpos($class,'thumbnail') !== false) {
            array_push($links, $url);
        }
    }

    // Print results
    print_r($links);
?>
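One caveat with the strpos() check above: it is a substring test, so it would also accept a class such as "thumbnails". If that matters, a whole-token comparison is a possible alternative. The sketch below uses a hypothetical helper, hasClass(), which is not part of DOMDocument or any library; it is only an illustration.

```php
<?php
// Hypothetical helper (an assumption, not a built-in): true if $token is one
// of the whitespace-separated names in a class attribute value.
function hasClass(string $classAttr, string $token): bool
{
    $names = preg_split('/\s+/', trim($classAttr), -1, PREG_SPLIT_NO_EMPTY);
    return in_array($token, $names, true);
}

var_dump(hasClass('thumbnail may-blank', 'thumbnail'));  // bool(true)
var_dump(hasClass('thumbnails', 'thumbnail'));           // bool(false)
var_dump(strpos('thumbnails', 'thumbnail') !== false);   // bool(true)
```

In the loop above, the condition could then read hasClass($class, 'thumbnail') in place of the strpos() comparison.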


 