Can't get target div with PHP cURL from Flarum

I am trying to get the latest articles from flarum.org, but my code doesn't retrieve any posts. It works on other, ordinary sites, but not on Flarum.

Here is my function:

function questions() {

    $url = 'https://discuss.flarum.org/';

    $curl = curl_init();
    curl_setopt( $curl, CURLOPT_URL, $url );
    curl_setopt( $curl, CURLOPT_HEADER, 0 );

    // SSL support
    curl_setopt( $curl, CURLOPT_SSL_VERIFYPEER, false );
    curl_setopt( $curl, CURLOPT_FOLLOWLOCATION, true );
    curl_setopt( $curl, CURLOPT_USERAGENT, $_SERVER[ 'HTTP_USER_AGENT' ] );

    // Variable support
    curl_setopt( $curl, CURLOPT_RETURNTRANSFER, true );

    $result = curl_exec( $curl );

    //echo $result;

    // Strip line breaks and tabs so the non-greedy regex can match across lines
    $result = str_replace( array( "\n", "\t", "\r" ), '', $result );

    preg_match_all( '#<div class="DiscussionListItem">(.*?)</div>#', $result, $match );

    print_r( $match );

    curl_close( $curl );

}

This function prints an empty array.

This is not how to parse HTML. Use an HTML parser instead. Something like this would work, if there were any matching elements in the HTML:

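$url = "https://discuss.flarum.org/";
$html = file_get_contents($url);
if ($html === false) {
    exit("Could not fetch $url");
}
// Collect libxml warnings about imperfect real-world markup instead of printing them
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
// Select every <div> whose class attribute is exactly "DiscussionListItem"
$results = $xpath->query("//div[@class='DiscussionListItem']");
foreach ($results as $result) {
    echo $result->nodeValue;
}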

Of course, there aren't any matching elements in the HTML that the server returns, presumably because Flarum builds the discussion list in the browser with JavaScript. You might be better off changing the XPath query to //div[@class='container']/ul/li/a instead.
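
To illustrate the modified query, here is a minimal sketch that runs it against a small inline document instead of the live site. The helper name extractLinks and the sample markup are made up for this example; the structure of the real page is an assumption and may differ, so inspect the fetched HTML before relying on the query.

```php
<?php
// Sketch only: extractLinks() and $sampleHtml are hypothetical, not part of Flarum.

// Run an XPath query over an HTML string and return title/href pairs
// for every element the query selects.
function extractLinks(string $html, string $query): array
{
    $dom = new DOMDocument();
    // Collect libxml warnings about imperfect markup instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $links = [];
    foreach ($xpath->query($query) as $node) {
        $links[] = [
            'title' => trim($node->textContent),
            'href'  => $node->getAttribute('href'),
        ];
    }
    return $links;
}

// Assumed shape of a server-rendered fallback list (hypothetical sample data).
$sampleHtml = <<<HTML
<html><body>
<div class="container">
  <ul>
    <li><a href="/d/1-first-topic">First topic</a></li>
    <li><a href="/d/2-second-topic">Second topic</a></li>
  </ul>
</div>
</body></html>
HTML;

foreach (extractLinks($sampleHtml, "//div[@class='container']/ul/li/a") as $link) {
    echo $link['title'], ' -> ', $link['href'], PHP_EOL;
}
```

To try it against the real page, pass the string returned by file_get_contents($url) (or curl_exec) in place of $sampleHtml. Running the original //div[@class='DiscussionListItem'] query through the same helper returns an empty array for this sample, which mirrors the empty result in the question.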
