
PHP code to load the page source of an HTTP URL

I am trying to find PHP code that will display the page source of a URL on my screen. I actually want to do more, but this is the first step, and I want to achieve it in a clean, reliable manner. Most posts say this has been asked and answered several times, but nothing seems to work reliably for me, and most of those posts are old. On top of that, I am very new to PHP and to web programming in general.

I did find some code using cURL, DOM, or plain built-in functions that works, but it is very sensitive to the PHP version. My hosting service offers PHP 5.2, 5.3, 5.5, and 5.6. The snippets that work in some versions display the rendered page itself, or a "bulleted" version of it without the images, but never anything that looks like the HTML document you get from "view page source" in a browser. So my question is: is this not possible at all, or am I missing something? One of the DOM snippets that echoes the page but not its source, and even then only in 5.2 and 5.5, is:

<?php
$dom = new DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTMLFile('http://www.cisco.com/');
echo $dom->saveHTML();
?>

My other important requirement is that my PHP code works in 5.3, at least for now, but I would like it to work in 5.2 through 5.5 if possible. Any pointers, please?

The issue is that when you echo the HTML, the browser interprets it as HTML. If you want to see it as "source", you need to either escape the HTML:

echo htmlspecialchars($dom->saveHTML());

or set the content type to text:

header("Content-Type: text/plain");
echo $dom->saveHTML();
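If the goal is the source exactly as the server sent it, it may be simpler to skip DOMDocument, because saveHTML() returns the document as the parser rebuilt it, which can differ from the original bytes. The following is only a sketch under assumptions: fetch_source and source_as_html are hypothetical helper names, allow_url_fopen is assumed to be enabled on the host, and the syntax is kept plain so it should run on the PHP 5.x versions mentioned in the question.

```php
<?php
// Hypothetical helper: return the raw bytes served at $url, or false on failure.
// Assumes allow_url_fopen is enabled in php.ini; some hosts disable it.
function fetch_source($url)
{
    // @ suppresses the warning so the caller can test for false instead.
    return @file_get_contents($url);
}

// Hypothetical helper: escape markup so a browser prints it instead of rendering it.
function source_as_html($html)
{
    // ISO-8859-1 makes htmlspecialchars treat the input byte by byte, so it
    // only rewrites & < > " ' and does not return an empty string when the
    // page contains byte sequences that are invalid in UTF-8.
    return '<pre>' . htmlspecialchars($html, ENT_QUOTES, 'ISO-8859-1') . '</pre>';
}

// Example use (commented out so that loading this file makes no request):
// $html = fetch_source('http://www.cisco.com/');
// echo $html === false ? 'Fetch failed' : source_as_html($html);
```

The pre wrapper keeps the original line breaks and indentation visible. If allow_url_fopen is off, the fetch step would have to be replaced, for example with cURL, while the escaping step stays the same.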

You can replace the < and > characters with the corresponding HTML entities so that the source shows on the screen rather than being parsed as markup by the browser:

echo str_replace('>', '&gt;', str_replace('<', '&lt;', $dom->saveHTML()));

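Or use echo htmlspecialchars($dom->saveHTML()); which is cleaner, but the version above at least gives you a glimpse of what htmlspecialchars is doing. Note that htmlspecialchars also escapes & and double quotes, which the str_replace version leaves alone.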
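To make the difference between the two approaches concrete, here is a small comparison on a made-up sample string. The function name escape_brackets_only and the sample markup are illustrative only; the point is that a page containing entities such as &amp; or &copy; is shown faithfully only when the ampersand is escaped as well.

```php
<?php
// Hypothetical name for the nested str_replace approach shown above.
function escape_brackets_only($html)
{
    return str_replace('>', '&gt;', str_replace('<', '&lt;', $html));
}

// Made-up sample: markup that already contains entities.
$sample = '<a href="?a=1&amp;b=2">x &copy; y</a>';

// Only < and > are rewritten, so the browser still decodes &amp; and &copy;
// and the displayed text no longer matches the real source.
$partial = escape_brackets_only($sample);
// $partial is: &lt;a href="?a=1&amp;b=2"&gt;x &copy; y&lt;/a&gt;

// htmlspecialchars also rewrites & and ", so every entity in the source
// appears on screen exactly as it was written.
$full = htmlspecialchars($sample);
// $full is: &lt;a href=&quot;?a=1&amp;amp;b=2&quot;&gt;x &amp;copy; y&lt;/a&gt;
```

With $partial, a browser would display the link text as "x © y"; with $full, it would display "x &copy; y", which is what the source actually contains.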
