PHP's DomXPath not working the way I expected

I'm trying to parse this HTML page: http://www.valor.com.br/valor-data/moedas

As a simple start, I'm trying to get all td elements with class="left" and echo their inner text. What I'm struggling to understand is why this code:

    $finder = new DomXPath($dom);
    $tds = $finder->query("//*[@class='left']");
    foreach ($tds as $td) {
        echo $td->textContent;
    }

gives me the expected output (a bunch of words that belong to those td elements which aren't worth pasting here) while this:

    $finder = new DomXPath($dom);
    $tds = $finder->query("//td[@class='left']");
    foreach ($tds as $td) {
        echo $td->textContent;
    }

finds nothing. I've also tried $finder->query("//td") to simply get all td elements, but that returns nothing either; it's as if DomXPath doesn't recognize tag names. Has anyone faced this same problem?

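I have not tested this, but it is probably a namespace issue. Your input page is XHTML and correctly declares the XHTML namespace, so its elements belong to that namespace. In XPath 1.0, an unprefixed name such as td matches only elements in no namespace, which is why //td finds nothing while //* (which matches any element) works. Therefore, you need to register a namespace prefix and use that prefix in your query.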

Something like this:

    $finder = new DomXPath($dom);
    $finder->registerNamespace("x", "http://www.w3.org/1999/xhtml");
    $tds = $finder->query("//x:td[@class='left']");
    foreach ($tds as $td) {
        echo $td->textContent;
    }
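
To make the difference easy to reproduce without fetching the real page, here is a minimal sketch. It assumes the document was loaded as XML (for example with loadXML), so the namespace declaration is preserved. The inline XHTML string and the variable names $xml, $unprefixed and $prefixed are made up for illustration and are not taken from the original page.

```php
<?php
// Hypothetical stand-in for the real page: a tiny XHTML document
// that declares the XHTML namespace as its default namespace.
$xml = <<<XML
<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <table>
      <tr><td class="left">Dollar</td><td class="right">1.00</td></tr>
      <tr><td class="left">Euro</td><td class="right">2.00</td></tr>
    </table>
  </body>
</html>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml); // parsed as XML, so namespaces are kept

$finder = new DOMXPath($dom);

// Unprefixed "td" only matches elements in no namespace.
$unprefixed = $finder->query("//td[@class='left']");

// Bind a prefix of our choosing ("x") to the XHTML namespace URI.
$finder->registerNamespace("x", "http://www.w3.org/1999/xhtml");
$prefixed = $finder->query("//x:td[@class='left']");

echo "unprefixed matches: ", $unprefixed->length, "\n"; // 0
echo "prefixed matches: ", $prefixed->length, "\n";     // 2

foreach ($prefixed as $td) {
    echo $td->textContent, "\n";
}
```

The prefix itself is arbitrary; only the namespace URI has to match the one declared in the document.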
