What is the XPath query to extract class content from a div on a PHP web page?

Question

I have written the following code, but it only returns empty data:

$code="CS225";

$url="https://cs.illinois.edu/courses/profile/{$code}";
echo $url;
$html = file_get_contents($url); 

$pokemon_doc = new DOMDocument();

libxml_use_internal_errors(TRUE); //disable libxml errors

if(!empty($html)){ //if any html is actually returned

    $pokemon_doc->loadHTML($html);
    libxml_clear_errors(); 

    $pokemon_xpath = new DOMXPath($pokemon_doc);

    $pokemon_row = $pokemon_xpath->query("//div[@id='extCoursesDescription']");

    if($pokemon_row->length > 0){
        foreach($pokemon_row as $row){
            echo $row->nodeValue . "<br/>";
        }
    }
}

The site I want to scrape is: https://cs.illinois.edu/courses/profile/CS225

Answer 1

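The course content appears to be loaded by a script when the page loads, so it is not in the HTML that file_get_contents() receives. However, if you look through the loaded source, you can find...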

<script type='text/javascript' src='//ws.engr.illinois.edu/courses/item.asp?n=3&course=CS225'></script>

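From this you can get the URL http://ws.engr.illinois.edu/courses/item.asp?n=3&course=CS225, which gives you the actual content. So by requesting this URL instead of the original one, you should be able to extract the information from there.

Note, though, that this content is all wrapped in document.write().

Update:

To remove the document.write() bit, a simple way is to just process the content as a string...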

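// Fetch the script URL found above instead of the course profile page.
$url = "http://ws.engr.illinois.edu/courses/item.asp?n=3&course=CS225";
$html = file_get_contents($url);

// Strip the document.write('...'); wrapper, then unescape the quotes.
$html = str_replace(["document.write('","');"], "", $html);
$html = str_replace('\"', '"', $html);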
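
To tie the two steps together, here is a minimal sketch of unwrapping the script output and then running an XPath query over it. The helper names unwrapDocumentWrite() and extractText() are hypothetical, and the sample payload is made up for illustration; the markup actually returned by the service may be structured differently, so the query may need adjusting.

```php
<?php
// Hypothetical helper: strips the document.write('...'); wrapper
// from a JavaScript payload and unescapes the double quotes.
function unwrapDocumentWrite(string $js): string
{
    $html = str_replace(["document.write('", "');"], "", $js);
    // Inside the JavaScript string literal, quotes arrive as \" pairs.
    return str_replace('\"', '"', $html);
}

// Hypothetical helper: returns the trimmed text of every node
// matching the given XPath query.
function extractText(string $html, string $query): array
{
    if (trim($html) === '') {
        return []; // nothing to parse
    }

    // Suppress libxml warnings about imperfect HTML while parsing.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $texts = [];
    foreach ($xpath->query($query) as $node) {
        $texts[] = trim($node->textContent);
    }
    return $texts;
}

// Sample payload, assumed for illustration only.
$sample = <<<'JS'
document.write('<div id=\"extCoursesDescription\">Data structures.</div>');
JS;

$rows = extractText(
    unwrapDocumentWrite($sample),
    "//div[@id='extCoursesDescription']"
);
echo implode("\n", $rows), "\n";
```

With the live service, $sample would be replaced by the result of file_get_contents() on the script URL. Keeping the unwrapping and the querying in separate functions makes each step easy to check on its own.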

Solution 1: 2, accepted, 2018-04-26 18:45:46