How to get meta tags in PHP?

Question

I am trying to extract the meta tags from the following URL, but it is not working and produces this warning: Warning: get_meta_tags(https://www.washingtonpost.com/politics/white-house-reels-as-fbi-director-contradicts-official-claims-about-alleged-abuser/2018/02/13/f010f256-10d9-11e8-9570-29c9830535e5_story.html?tid=pm_pop): failed to open stream: Redirection limit reached, aborting. Any ideas about this?

Answer 1

To start, you need to request the site's home page first so that it sets its cookie; otherwise the request for the article will not work.

// First request: load the home page so the site can set its cookies.
$ch = curl_init();
curl_setopt($ch, CURLOPT_HEADER, false);
// Note: this turns off TLS certificate verification; keep it enabled in production.
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_URL, "https://www.washingtonpost.com");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/525.13 (KHTML, like Gecko) Chrome/0.A.B.C Safari/525.13");
// Name the cookie file after the visitor's PHP session ID, if there is one.
$cookieName = "";
if (isset($_COOKIE['PHPSESSID'])) {
    $cookieName = $_COOKIE['PHPSESSID'];
}
// COOKIEJAR is the file cURL saves received cookies to; COOKIEFILE is the file it reads them from.
curl_setopt($ch, CURLOPT_COOKIEJAR, $_SERVER['DOCUMENT_ROOT'].'/logs/'.$cookieName.'.txt');
curl_setopt($ch, CURLOPT_COOKIEFILE, $_SERVER['DOCUMENT_ROOT'].'/logs/'.$cookieName.'.txt');
curl_exec($ch);
curl_close($ch);

Then make a second call to fetch the actual page, reusing the same cookie file:

// Second request: fetch the article, sending the cookies saved by the first request.
$ch = curl_init();
curl_setopt($ch, CURLOPT_HEADER, false);
// Note: this turns off TLS certificate verification; keep it enabled in production.
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_URL, "https://www.washingtonpost.com/politics/white-house-reels-as-fbi-director-contradicts-official-claims-about-alleged-abuser/2018/02/13/f010f256-10d9-11e8-9570-29c9830535e5_story.html?tid=pm_pop");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/525.13 (KHTML, like Gecko) Chrome/0.A.B.C Safari/525.13");
$cookieName = "";
if (isset($_COOKIE['PHPSESSID'])) {
    $cookieName = $_COOKIE['PHPSESSID'];
}
// This must be the same cookie file path that the first request used.
curl_setopt($ch, CURLOPT_COOKIEJAR, $_SERVER['DOCUMENT_ROOT'].'/logs/'.$cookieName.'.txt');
curl_setopt($ch, CURLOPT_COOKIEFILE, $_SERVER['DOCUMENT_ROOT'].'/logs/'.$cookieName.'.txt');
$page = curl_exec($ch);
curl_close($ch);

Finally, parse the DOM tree with DOMDocument:

// Collect libxml warnings internally instead of emitting them for malformed HTML.
libxml_use_internal_errors(true);
$siteData = new DOMDocument();
$siteData->loadHTML($page);

$metaElements = $siteData->getElementsByTagName("meta");
if ($metaElements->length === 0) {
    echo "ERROR";
}

// For each <meta> element, store its attributes as [name, value] pairs.
$meta = array();
foreach ($metaElements as $element) {
    $attributes = array();
    foreach ($element->attributes as $attribute) {
        $attributes[] = array($attribute->name, $attribute->value);
    }
    $meta[] = $attributes;
}
print_r($meta);

The meta tags are now stored in the $meta array.
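If you would rather look tags up by name than walk nested attribute pairs, the same parsing step can build an associative array. The following is only a sketch: metaTagsToMap() and the sample markup are hypothetical and not part of the answer above, and it assumes that only tags carrying a content attribute plus a name, property, or http-equiv attribute are of interest.

```php
<?php
// Hypothetical helper: map each meta tag's name/property/http-equiv
// attribute to the value of its content attribute.
function metaTagsToMap(string $html): array
{
    libxml_use_internal_errors(true); // tolerate malformed real-world HTML
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $map = array();
    foreach ($doc->getElementsByTagName('meta') as $element) {
        // Skip tags without a content attribute (for example <meta charset>).
        if (!$element->hasAttribute('content')) {
            continue;
        }
        // Use the first identifying attribute found as the array key.
        foreach (array('name', 'property', 'http-equiv') as $keyAttribute) {
            if ($element->hasAttribute($keyAttribute)) {
                $map[$element->getAttribute($keyAttribute)] = $element->getAttribute('content');
                break;
            }
        }
    }
    return $map;
}

// Sample markup used only for illustration.
$sample = '<html><head><meta name="description" content="Example page">'
        . '<meta property="og:title" content="Example title"></head><body></body></html>';
print_r(metaTagsToMap($sample));
```

With the fetched page you would call metaTagsToMap($page) instead of passing the sample string. If two tags share a key, the later one overwrites the earlier one.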

You can tidy this code up by moving the duplicated cURL setup into a function.
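One possible way to do that is sketched below. The names buildCurlOptions() and fetchWithCookies() are hypothetical, and the cookie file path is left to the caller. Unlike the code above, this sketch does not set CURLOPT_SSL_VERIFYPEER, so certificate verification stays at cURL's default.

```php
<?php
// Hypothetical helper: the cURL options shared by both requests, in one place.
function buildCurlOptions(string $url, string $cookieFile): array
{
    return array(
        CURLOPT_URL            => $url,
        CURLOPT_HEADER         => false,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/525.13 (KHTML, like Gecko) Chrome/0.A.B.C Safari/525.13',
        // Read cookies from and save cookies to the same file.
        CURLOPT_COOKIEJAR      => $cookieFile,
        CURLOPT_COOKIEFILE     => $cookieFile,
    );
}

// Hypothetical helper: perform one GET request and return the response body,
// or false if the transfer failed.
function fetchWithCookies(string $url, string $cookieFile)
{
    $ch = curl_init();
    curl_setopt_array($ch, buildCurlOptions($url, $cookieFile));
    $body = curl_exec($ch);
    curl_close($ch);
    return $body;
}
```

Calling fetchWithCookies() first with the home page URL and then with the article URL, passing the same cookie file path both times, reproduces the two requests shown earlier; the second call's return value takes the place of $page.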
