php curl CURLOPT_HEADER and DOM

I have the following code:

curl_setopt($ch, CURLOPT_URL, $host);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_USERAGENT, $user_agent);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
$html = curl_exec($ch);

preg_match_all('|Set-Cookie: (.*);|U', $html, $results);
$cookies = implode(';', $results[1]);

$dom = new DOMDocument();
$dom->loadHTML($html);

On the line $dom->loadHTML($html); I get the following errors:

Warning: DOMDocument::loadHTML() [function.DOMDocument-loadHTML]:
Misplaced DOCTYPE declaration in
Entity, line: 12 in
D:\Programs\xampp\xampp\htdocs\ip\megafonmoscow.php
on line 39

Warning: DOMDocument::loadHTML()
[function.DOMDocument-loadHTML]:
htmlParseStartTag: misplaced 
tag in Entity, line: 13 in
D:\Programs\xampp\xampp\htdocs\ip\megafonmoscow.php
on line 39

Warning: DOMDocument::loadHTML()
[function.DOMDocument-loadHTML]:
htmlParseStartTag: misplaced 
tag in Entity, line: 14 in
D:\Programs\xampp\xampp\htdocs\ip\megafonmoscow.php
on line 39

Warning: DOMDocument::loadHTML()
[function.DOMDocument-loadHTML]:
Unexpected end tag : head in Entity,
line: 32 in
D:\Programs\xampp\xampp\htdocs\ip\megafonmoscow.php
on line 39

Warning: DOMDocument::loadHTML()
[function.DOMDocument-loadHTML]:
htmlParseStartTag: misplaced 
tag in Entity, line: 34 in
D:\Programs\xampp\xampp\htdocs\ip\megafonmoscow.php
on line 39

Is the line curl_setopt($ch, CURLOPT_HEADER, 1); the cause of these errors? I need it because of the cookies. Any ideas on how to solve this?

An alternative to mck89's method is to download the headers and body together, but split them apart before trying to parse the body:

$html = curl_exec($ch);

[snip]

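// Strip everything up to and including the first blank line between headers and body.
// The match is non-greedy (so blank lines inside the HTML are left alone) and accepts \r\n or \n.
$html = preg_replace('/^.*?\r?\n\r?\n/s', '', $html, 1);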

$dom = new DOMDocument();
$dom->loadHTML($html);

This saves an HTTP request, and therefore some time.
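A variation on the same idea, as a minimal sketch: instead of a regular expression, cURL can report how many bytes of the response are headers via curl_getinfo($ch, CURLINFO_HEADER_SIZE), which should also cope with responses that carry more than one header block (for example after a redirect). The helper names split_response and extract_cookies are hypothetical and not part of any library.

```php
<?php
// Hypothetical helper: split a raw cURL response (fetched with CURLOPT_HEADER
// enabled) into its header block and body, given the header size in bytes.
function split_response(string $raw, int $headerSize): array
{
    return [
        'headers' => substr($raw, 0, $headerSize),
        'body'    => substr($raw, $headerSize),
    ];
}

// Hypothetical helper: collect the name=value part of every Set-Cookie header
// and join them into a string suitable for a Cookie request header.
function extract_cookies(string $headers): string
{
    preg_match_all('/^Set-Cookie:\s*([^;\r\n]+)/mi', $headers, $matches);
    return implode('; ', $matches[1]);
}

// Intended use with the handle from the question:
//   $raw     = curl_exec($ch);
//   $parts   = split_response($raw, curl_getinfo($ch, CURLINFO_HEADER_SIZE));
//   $cookies = extract_cookies($parts['headers']);
//   $dom     = new DOMDocument();
//   $dom->loadHTML($parts['body']);
```

Only the body reaches loadHTML here, so the parser never sees the header lines that precede the DOCTYPE.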

Try removing that line so that the headers are not returned, then fetch them with the get_headers function after the curl request.

curl_setopt($ch, CURLOPT_URL, $host);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_USERAGENT, $user_agent);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
$html = curl_exec($ch);
$headers = get_headers($host, 1);
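One thing to keep in mind is that get_headers sends its own request, so the cookies it sees may differ from those of the cURL request. With the second argument set to 1 the result is an associative array, and a header that appears more than once, such as Set-Cookie, comes back as an array of values instead of a string. A minimal sketch of collecting cookies from such a result follows; cookies_from_headers is a hypothetical helper name, and the header-name comparison is made case-insensitive as a precaution.

```php
<?php
// Hypothetical helper: build a Cookie header value from the associative
// array returned by get_headers($url, 1).
function cookies_from_headers(array $headers): string
{
    $pairs = [];
    foreach ($headers as $name => $value) {
        // Numeric keys hold status lines; skip anything that is not Set-Cookie.
        if (!is_string($name) || strcasecmp($name, 'Set-Cookie') !== 0) {
            continue;
        }
        // A single header is a string, repeated headers are an array.
        foreach ((array) $value as $cookie) {
            // Keep only "name=value", dropping attributes such as path or expires.
            $pairs[] = trim(explode(';', $cookie, 2)[0]);
        }
    }
    return implode('; ', $pairs);
}

// Intended use:
//   $headers = get_headers($host, 1);
//   $cookies = $headers === false ? '' : cookies_from_headers($headers);
```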
