Retrieving categories from a website using the DOMDocument and DOMXPath classes

Question

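I have PHP code that is supposed to retrieve the categories from this website by matching the class name "sub-title". However, the output is empty. What am I doing wrong?

PHP code: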

<?php
header('Content-Type: text/html; charset=utf-8');
$grep = new DoMDocument();
@$grep>loadHTMLFile("http://www.alibaba.com/Products",false,stream_context_create(array("http" => array("user_agent" => "any"))));

$finder = new DomXPath($grep);
$class = "sub-title";
$nodes = $finder->query("//*[contains(@class, '$class')]");

foreach ($nodes as $node) {
    $span = $node->childNodes;
    echo $span->item(0)->nodeValue;

}

?>

Desired output:
Agriculture
Food & Beverage
Apparel
etc.

Thanks!

Answer 1

Just target that particular element. By the way, your current code has a typo at $grep>loadHTMLFile: the - is missing from ->. I modified it a little.

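// Fetch the page with cURL, sending a browser-like User-Agent header.
$ch = curl_init('http://www.alibaba.com/Products');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_USERAGENT,'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');
$html = curl_exec($ch);
if ($html === false) {
    // curl_exec() returns false on failure; stop before parsing.
    die('Request failed: ' . curl_error($ch));
}

// Parse the HTML; @ suppresses the warnings libxml raises for malformed markup.
$dom = new DOMDocument();
@$dom->loadHTML($html);
$finder = new DOMXPath($dom);
$nodes = $finder->query('//h4[@class="sub-title"]');

foreach ($nodes as $node) {
    // Keep only the first line of each heading's text.
    $sub_title = trim(explode("\n", trim($node->nodeValue))[0]) . '<br/>';
    echo $sub_title;
}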
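
The question matches the class with contains(), while the answers compare the whole attribute with @class="sub-title". The sketch below shows how those two predicates, plus a whole-token test, behave on a small local fragment. The markup and the matchText() helper are made up for illustration and are not taken from the real page, so treat this as a rough comparison rather than a description of the site's HTML.

```php
<?php
// Hypothetical markup for illustration only; the real page's HTML may differ.
$html = <<<HTML
<div>
  <h4 class="sub-title"><a href="#">Agriculture</a></h4>
  <h4 class="sub-title featured"><a href="#">Apparel</a></h4>
  <h4 class="sub-title-small"><a href="#">Not a category</a></h4>
</div>
HTML;

/**
 * Hypothetical helper: return the trimmed text of every node
 * matched by the given XPath expression.
 */
function matchText(DOMXPath $xpath, string $expr): array
{
    $out = [];
    foreach ($xpath->query($expr) as $node) {
        $out[] = trim($node->textContent);
    }
    return $out;
}

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Exact attribute comparison: misses elements that carry additional classes.
$exact = matchText($xpath, '//h4[@class="sub-title"]');

// Substring test: also matches class names such as "sub-title-small".
$substring = matchText($xpath, '//h4[contains(@class, "sub-title")]');

// Token test: matches "sub-title" only as a whole class name.
$token = matchText(
    $xpath,
    '//h4[contains(concat(" ", normalize-space(@class), " "), " sub-title ")]'
);

print_r($exact);
print_r($substring);
print_r($token);
```

On this fragment the exact comparison finds only "Agriculture", the substring test finds all three headings, and the token test finds "Agriculture" and "Apparel". Which predicate is appropriate depends on how the target page actually writes its class attributes.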

Answer 2

To set a stream context when fetching the HTML with DOMDocument::loadHTMLFile, use libxml_set_streams_context:

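<?php

// Register the stream context (custom User-Agent) for libxml's own HTTP requests.
$context = stream_context_create(array('http' => array('user_agent' => 'any')));
libxml_set_streams_context($context);
// Collect parse errors internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTMLFile('http://www.alibaba.com/Products');

// Select the link inside each category heading.
$xpath = new DOMXPath($doc);
$nodes = $xpath->query('//h4[@class="sub-title"]/a');

foreach ($nodes as $node) {
    echo trim($node->textContent) . "\n";
}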
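
The code above calls libxml_use_internal_errors(true) rather than the @ operator. One possible benefit is that the collected parse errors can still be inspected. The following is a minimal sketch of that idea on a local string; the markup is hypothetical and deliberately sloppy, and whether libxml reports anything for it depends on the installed libxml2 version.

```php
<?php
// Hypothetical, deliberately sloppy markup; not taken from the real page.
$html = '<h4 class="sub-title"><a href="#">Agriculture</a></h4><p>Unclosed paragraph';

// Collect libxml errors internally instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);

$doc = new DOMDocument();
$loaded = $doc->loadHTML($html);

// Each entry is a LibXMLError object with level, code, line and message.
$errors = libxml_get_errors();
foreach ($errors as $error) {
    error_log(sprintf('libxml [line %d]: %s', $error->line, trim($error->message)));
}

// Empty the error buffer and restore the previous error-handling mode.
libxml_clear_errors();
libxml_use_internal_errors($previous);

// The document is still queryable even if the parser reported problems.
$xpath = new DOMXPath($doc);
$titles = [];
foreach ($xpath->query('//h4[@class="sub-title"]/a') as $node) {
    $titles[] = trim($node->textContent);
}

echo implode("\n", $titles), "\n";
```

Restoring the previous mode keeps the change local, which may matter if other code in the same request relies on the default warning behaviour.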

Answer 1
1 | Accepted | 2014-08-27 02:30:56

Answer 2
1 | 2014-08-27 08:16:13
