
Printing page content with htmlentities() doesn't work for google.com

I use this code to print the content (source code) of a web page:

<?php
$url='http://cloob.com';
$ch=curl_init();
$timeout=5;

curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
curl_setopt( $ch, CURLOPT_FOLLOWLOCATION, true );
// Get URL content
$lines_string=curl_exec($ch);
// close handle to release resources
curl_close($ch);
var_dump( htmlspecialchars($lines_string));
//echo htmlentities($lines_string);
//var_dump( $lines_string);
?>

This works, but when I change the URL to https://google.com it prints nothing. Why?

In both cases it works when I output the string directly, without htmlspecialchars() or htmlentities(). (I am running this on http://phpfiddle.org/.)

The first thing to do is read the docs for htmlspecialchars():

If the input string contains an invalid code unit sequence within the given encoding an empty string will be returned, unless either the ENT_IGNORE or ENT_SUBSTITUTE flags are set.

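So the empty output points to one of three things: PHP is having trouble parsing the HTML, the encoding passed to htmlspecialchars() doesn't match the page, or the HTML itself contains invalid byte sequences.

When you pass an encoding that matches what the Google page actually returns, you get the result you want: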
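To see the documented behavior in isolation, here is a minimal sketch that needs no network access. The sample string is an assumption made up for illustration: "\xE9" is "é" in ISO-8859-1 but is not a valid byte sequence in UTF-8. The flags are passed explicitly because the default flags differ between PHP versions (as of PHP 8.1 the default includes ENT_SUBSTITUTE).

```php
<?php
// Illustrative input: "\xE9" is "é" in ISO-8859-1, but invalid on its own in UTF-8.
$latin1 = "caf\xE9 <b>";

// Without ENT_SUBSTITUTE or ENT_IGNORE, input that is invalid for the
// given encoding makes htmlspecialchars() return an empty string.
$strict = htmlspecialchars($latin1, ENT_COMPAT, 'UTF-8');

// ENT_SUBSTITUTE replaces each invalid sequence with U+FFFD instead of failing.
$substituted = htmlspecialchars($latin1, ENT_COMPAT | ENT_SUBSTITUTE, 'UTF-8');

// Declaring the encoding the bytes are really in also avoids the empty result.
$asLatin1 = htmlspecialchars($latin1, ENT_COMPAT, 'ISO-8859-1');

var_dump($strict === '');   // the "nothing is printed" symptom
var_dump($substituted);
var_dump($asLatin1);
```

If the first var_dump() shows true for your fetched page as well, the problem is the encoding and not cURL.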

var_dump( htmlspecialchars($lines_string, ENT_COMPAT, 'ISO-8859-1'));
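Hard-coding 'ISO-8859-1' fixes this one page, but other sites use other encodings. One possible generalization, sketched here under assumptions, is to read the charset from the response's Content-Type header and convert the body to UTF-8 before escaping. The helper name escape_fetched_html() is hypothetical, the sketch assumes the mbstring extension and PHP 8, and falling back to ISO-8859-1 when no charset is declared is a guess, not a rule.

```php
<?php
// Hypothetical helper: normalize a fetched body to UTF-8, then escape it.
// $contentType is the raw Content-Type header value, or null if unknown.
function escape_fetched_html(string $body, ?string $contentType): string
{
    $charset = null;
    // Pull "charset=..." out of a value such as "text/html; charset=ISO-8859-1".
    if ($contentType !== null
        && preg_match('/charset=["\']?([\w.:-]+)/i', $contentType, $m)) {
        $charset = $m[1];
    }
    if ($charset === null) {
        // Assumption: bytes that are not valid UTF-8 are treated as ISO-8859-1.
        $charset = mb_check_encoding($body, 'UTF-8') ? 'UTF-8' : 'ISO-8859-1';
    }
    try {
        $utf8 = mb_convert_encoding($body, 'UTF-8', $charset);
    } catch (\ValueError $e) {
        // PHP 8 throws on an unknown charset label; ISO-8859-1 accepts any byte.
        $utf8 = mb_convert_encoding($body, 'UTF-8', 'ISO-8859-1');
    }
    // ENT_SUBSTITUTE keeps a stray invalid byte from blanking the whole output.
    return htmlspecialchars($utf8, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// Offline example with an ISO-8859-1 body.
var_dump(escape_fetched_html("caf\xE9 <b>", 'text/html; charset=ISO-8859-1'));
```

With the cURL code from the question, the header value can be read with curl_getinfo($ch, CURLINFO_CONTENT_TYPE). It has to be read before curl_close($ch) is called, and then passed in together with $lines_string.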
