
DomXpath and foreach. How to get a preview of the captured elements?

I am learning to work with DOMXPath in PHP. I was using regex before, but people here on Stack Overflow discouraged it for capturing HTML. I confess that DOMXPath is not so simple for me, and the DOM has its limits (when there are spaces in tag names, and also in error handling). If someone can help me with the PHP command to get a preview of the captured elements and check that everything is right, I would appreciate it. Suggestions for improving the code are also welcome. The code below is based on another Stack Overflow question.

<?php
    $doc = new DOMDocument;
    libxml_use_internal_errors(true);
    // Deleting whitespace (if any)
    $doc->preserveWhiteSpace = false;
    @$doc->loadHTML(file_get_contents ('http://www.imdb.com/search/title?certificates=us:pg_13&genres=comedy&groups=top_250'));
    $xpath = new DOMXPath($doc);
    // Starting from the root element
    $grupos = $xpath->query(".//*[@class='lister-item mode-advanced']");
    // Creating an array and then looping with the elements to be captured (image, title, and link)
    $resultados = array();
    foreach($grupos as $grupo) {
        $i = $xpath->query(".//*[@class='loadlate']//@src", $grupo);
        $t = $xpath->query(".//*[@class='lister-item-header']//a/text()", $grupo);
        $l = $xpath->query(".//*[@class='lister-item-header']//a/@href", $grupo);

        $resultados[] = $resultado;

    }
    // What command should I use to have a preview of the results and check if everything is ok?
    print_r($resultados);

OK, here is your code with two corrections. First, I add a sub-array with the captured elements to $resultados (your code appended an undefined $resultado). Second, I loop over the results with foreach instead of using print_r/var_dump.

BTW, doesn't imdb offer an API?

    <?php
    ini_set('display_errors', 1);
    error_reporting(-1);

    $doc = new DOMDocument;
    // Collect libxml parse warnings internally instead of printing them
    libxml_use_internal_errors(true);
    // Drop redundant whitespace (if any)
    $doc->preserveWhiteSpace = false;
    $doc->loadHTML(file_get_contents('http://www.imdb.com/search/title?certificates=us:pg_13&genres=comedy&groups=top_250'));
    //$doc->loadHTML($HTML);
    $xpath = new DOMXPath($doc);
    // Starting from the document root, select every result item
    $grupos = $xpath->query(".//*[@class='lister-item mode-advanced']");
    // Creating an array and then looping with the elements to be captured (image, title, and link)
    $resultados = array();
    foreach ($grupos as $grupo) {
        // Each query is evaluated relative to the current item ($grupo)
        $i = $xpath->query(".//*[@class='loadlate']//@src", $grupo);
        $t = $xpath->query(".//*[@class='lister-item-header']//a/text()", $grupo);
        $l = $xpath->query(".//*[@class='lister-item-header']//a/@href", $grupo);

        // Keep the first match of each query in a sub-array
        $resultados[] = ['i' => $i[0], 't' => $t[0], 'l' => $l[0]];
    }
    // Preview the results with a loop instead of dumping the node objects
    //var_dump($resultados);
    foreach ($resultados as $r) {
        echo "\n-----------\n";
        echo $r['i']->value."\n";
        echo $r['t']->textContent."\n";
        echo $r['l']->value."\n";
    }

You can play with it here: https://3v4l.org/hal0G
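
If an item is missing one of the elements (no image, for example), taking the first node of an empty result gives null, and reading ->value from it fails. One possible way around that, sketched here under a few assumptions, is to let XPath return strings directly through DOMXPath::evaluate() with string(...), which yields an empty string when nothing matches. The $html sample below is made-up markup standing in for the downloaded page, not the real IMDb response, and the libxml_get_errors() loop is only there to show which parse warnings were collected.

```php
<?php
// Hypothetical sample markup standing in for the downloaded page.
$html = <<<'HTML'
<html><body>
<div class="lister-item mode-advanced">
  <img class="loadlate" src="a.jpg">
  <h3 class="lister-item-header"><a href="/title/1/">First</a></h3>
</div>
<div class="lister-item mode-advanced">
  <h3 class="lister-item-header"><a href="/title/2/">Second</a></h3>
</div>
</body></html>
HTML;

// Collect parse warnings instead of printing them while loading
libxml_use_internal_errors(true);
$doc = new DOMDocument;
$doc->loadHTML($html);

// Show whatever libxml complained about, then reset the error buffer
foreach (libxml_get_errors() as $error) {
    echo 'line ' . $error->line . ': ' . trim($error->message) . "\n";
}
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$resultados = [];
foreach ($xpath->query("//*[@class='lister-item mode-advanced']") as $grupo) {
    // string(...) returns '' when nothing matches, so no null checks are needed
    $resultados[] = [
        'i' => $xpath->evaluate("string(.//*[@class='loadlate']/@src)", $grupo),
        't' => $xpath->evaluate("string(.//*[@class='lister-item-header']//a)", $grupo),
        'l' => $xpath->evaluate("string(.//*[@class='lister-item-header']//a/@href)", $grupo),
    ];
}

// The array now holds plain strings, so print_r gives a readable preview
print_r($resultados);
```

Because the values are strings rather than DOM nodes, print_r and var_dump stay readable, and an empty string marks a field that was not found in that item.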
