
PHP's DomXPath not working the way I expected

I'm trying to parse this HTML page: http://www.valor.com.br/valor-data/moedas

As a simple start, I'm trying to get all td elements with class="left" and echo their inner text. What I'm struggling to understand is why this code:

    $finder = new DomXPath($dom);
    $tds = $finder->query("//*[@class='left']");
    foreach ($tds as $td) {
        echo $td->textContent;
    }

gives me the expected output (a bunch of words belonging to those td elements, not worth pasting here), while this:

    $finder = new DomXPath($dom);
    $tds = $finder->query("//td[@class='left']");
    foreach ($tds as $td) {
        echo $td->textContent;
    }

finds nothing. I've also tried $finder->query("//td") to simply get all td elements, but it's as if DomXPath doesn't recognize tag names. Has anyone faced this same problem?

I have not tested this, but it is probably a namespace issue. Your input page is XHTML and correctly declares the XHTML namespace, so every element, td included, belongs to that namespace. In XPath 1.0 an unprefixed name such as td matches only elements in no namespace, which is why //td finds nothing while //* still matches. Therefore, you need to register a namespace prefix and use that prefix in your query.

Something like this:

    $finder = new DomXPath($dom);
    $finder->registerNamespace("x", "http://www.w3.org/1999/xhtml");
    $tds = $finder->query("//x:td[@class='left']");
    foreach ($tds as $td) {
        echo $td->textContent;
    }
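
To see the effect in isolation, here is a minimal self-contained sketch. It does not fetch the real page: the inline XHTML string and its cell values are made up for illustration, and it assumes the document is loaded with the XML parser (loadXML), which keeps the default namespace. It compares the unprefixed query, the prefixed query, and a namespace-agnostic variant using local-name().

```php
<?php
// Hypothetical stand-in for the real page: a tiny XHTML document
// that declares the XHTML namespace as its default namespace.
$xhtml = <<<XML
<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <table>
      <tr><td class="left">Dollar</td><td class="right">x</td></tr>
      <tr><td class="left">Euro</td><td class="right">y</td></tr>
    </table>
  </body>
</html>
XML;

$dom = new DOMDocument();
$dom->loadXML($xhtml); // the XML parser keeps the default namespace

$finder = new DOMXPath($dom);
$finder->registerNamespace('x', 'http://www.w3.org/1999/xhtml');

// Unprefixed name: matches only elements in no namespace.
$plain = $finder->query("//td[@class='left']");

// Prefixed name: matches td elements in the XHTML namespace.
$prefixed = $finder->query("//x:td[@class='left']");

// Alternative that ignores the namespace and compares the local name only.
$byLocalName = $finder->query("//*[local-name()='td'][@class='left']");

echo $plain->length, "\n";       // 0
echo $prefixed->length, "\n";    // 2
echo $byLocalName->length, "\n"; // 2

foreach ($prefixed as $td) {
    echo $td->textContent, "\n";
}
```

The prefix "x" is arbitrary; only the namespace URI has to match the one declared in the document. The local-name() form avoids registering a prefix, but it will also match td elements from any other namespace, so the prefixed query is usually the more precise choice.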
