
Can't get target div with php curl from flarum

I am trying to get the latest articles from flarum.org, but my code doesn't return any posts. It works on other, normal sites, but not on Flarum.

Here is my function:

function questions() {

    $url = 'https://discuss.flarum.org/';

    $curl = curl_init();
    curl_setopt( $curl, CURLOPT_URL, $url );
    curl_setopt( $curl, CURLOPT_HEADER, 0 );

    // SSL support
    curl_setopt( $curl, CURLOPT_SSL_VERIFYPEER, false );
    curl_setopt( $curl, CURLOPT_FOLLOWLOCATION, true );
    curl_setopt( $curl, CURLOPT_USERAGENT, $_SERVER[ 'HTTP_USER_AGENT' ] );

    // Variable support
    curl_setopt( $curl, CURLOPT_RETURNTRANSFER, true );

    $result = curl_exec( $curl );

    //echo $result;

    $result = str_replace( array( "\n", "\t", "\r" ), null, $result );

    preg_match_all( '#<div class="DiscussionListItem">(.*?)</div>#', $result, $match );

    print_r( $match );

    curl_close( $curl );

}

This function prints an empty array.

Regular expressions are not a reliable way to parse HTML; use an HTML parser instead. Something like this would work, if the HTML contained any matching elements:

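$url = "https://discuss.flarum.org/";
// Fetch the raw HTML; file_get_contents() returns false on failure.
$html = file_get_contents($url);
if ($html === false) {
    exit("Could not fetch $url");
}
$dom = new DOMDocument();
// Collect libxml parse warnings internally instead of printing them.
libxml_use_internal_errors(true);
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
// Select every div whose class attribute is exactly "DiscussionListItem".
$results = $xpath->query("//div[@class='DiscussionListItem']");
foreach ($results as $result) {
    echo $result->nodeValue;
}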

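Of course, the HTML contains no matching elements. Flarum appears to build the discussion list in the browser with JavaScript, so the markup that cURL or file_get_contents() receives has no DiscussionListItem divs. You may be better off changing the XPath query to //div[@class='container']/ul/li/a instead.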
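As a rough illustration of that alternative query, here is a minimal sketch that collects the title and URL of each linked discussion. The helper name extractDiscussionLinks() and the sample markup are made up for this example; the markup only mirrors the structure the XPath expression implies, and the real page may differ. In practice $html would come from the cURL call or file_get_contents().

```php
<?php
// Hypothetical helper (not part of any library): returns the link text and
// href of every anchor matched by //div[@class='container']/ul/li/a.
function extractDiscussionLinks(string $html): array
{
    $dom = new DOMDocument();
    // Keep libxml warnings about imperfect markup out of the output.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $links = [];
    foreach ($xpath->query("//div[@class='container']/ul/li/a") as $a) {
        $links[] = [
            'title' => trim($a->textContent),
            'href'  => $a->getAttribute('href'),
        ];
    }
    return $links;
}

// Made-up sample markup, assumed to resemble the page's plain HTML list.
$sample = '<html><body><div class="container"><ul>'
    . '<li><a href="https://example.com/d/1-first">First discussion</a></li>'
    . '<li><a href="https://example.com/d/2-second"> Second discussion </a></li>'
    . '</ul></div></body></html>';

print_r(extractDiscussionLinks($sample));
```

Returning an array of title/href pairs instead of echoing nodeValue keeps the parsing separate from the output, so the same result can be printed, stored, or filtered.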

