
How to use fsockopen to load a url from an xml sitemap

I am trying to load each URL listed in a sitemap.xml file in order to pre-cache the pages and speed up the user's experience.

I have the following code, which grabs the URLs from the sitemap:

$ch = curl_init();
/**
* Set the URL of the page or file to download.
*/
curl_setopt($ch, CURLOPT_URL, 'http://onlineservices.letterpart.com/sitemap.xml;jsessionid=1j1agloz5ke7l?id=1j1agloz5ke7l');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$data = curl_exec ($ch);
curl_close ($ch);

$xml = new SimpleXMLElement($data);
foreach ($xml->url as $url_list) {
    $url = $url_list->loc;
    echo $url ."<br>";    
}

I am now trying to use fsockopen to load each URL in turn, where $url is in this format: http://onlineservices.letterpart.com:80/content/en/FAMILY-201103311115/Family_FLJONLINE_FLJ_2009_07_4

foreach ($xml->url as $url_list) {
    $url = $url_list->loc;
    $fp = fsockopen($url, 80);
    if ($fp) {
        fwrite($fp, "GET / HTTP/1.1\r\nHOST: $url\r\n\r\n");

        while (!feof($fp)) {
            print fread($fp, 256);
        }

        fclose($fp);
    } else {
        print "Fatal error\n";
    }
}

But this gives me the following error for each URL:

[12-May-2011 13:34:09] PHP Warning: fsockopen() [function.fsockopen]: unable to connect to http://onlineservices.letterpart.com:80/content/en/FAMILY-201103311115/Family_FLJONLINE_FLJ_2009_07_4:-1 (Unable to find the socket transport "http" - did you forget to enable it when you configured PHP?) in /home/digital1/public_html/dev/sitemap.php on line 32

I have read that I need to use "just the hostname, not the URL in the fsockopen call. You'll need to provide the uri, minus the host/port in the actual HTTP headers"

so I tried this:

$fp = fsockopen("http://onlineservices.letterpart.com", 80);
if ($fp) {
    fwrite($fp, "GET / HTTP/1.1\r\nHOST: content/en/FAMILY-201103311115/Family_FLJONLINE_FLJ_2009_07_4\r\n\r\n");

    while (!feof($fp)) {
        print fread($fp, 256);
    }

    fclose($fp);
} else {
    print "Fatal error\n";
}

But I still get the same error.

EDIT:

If I change the fsockopen call to:

$fp = fsockopen ("onlineservices.letterpart.com",80);

then I get a slightly different response that is better but still wrong. It seems to be ignoring the onlineservices.letterpart.com part and trying http:///content/ BUT it has appended /web/ui.xql?action=html&resource=login.html to the end of the URL, which is our login page, so it must be reaching our server:

HTTP/1.1 302 Moved Temporarily
Date: Thu, 12 May 2011 14:40:02 GMT
Server: Jetty/5.1.12 (Windows 2003/5.2 x86 java/1.6.0_07
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Set-Cookie: JSESSIONID=nh62zih3q8mf;Path=/
Location: http:///content/en/FAMILY-201103311115/Family_FLJONLINE_FLJ_2009_07_4/web/ui.xql?action=html&resource=login.html
Content-Length: 0

Thanks.

fsockopen is a low-level socket function and is not really intended for making HTTP requests; curl is a better choice (and much more powerful).
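
As a rough illustration of the curl route, here is a minimal sketch. The helper names sitemap_urls() and warm_url(), the 10-second timeout and the choice to follow redirects are my own assumptions, not anything taken from the question; only the curl and SimpleXML calls themselves are standard PHP.

```php
<?php
// Hypothetical helper: return every <loc> value in a sitemap as a plain string.
function sitemap_urls(string $sitemapXml): array
{
    $xml  = new SimpleXMLElement($sitemapXml);
    $urls = [];
    foreach ($xml->url as $entry) {
        // Cast explicitly: $entry->loc is a SimpleXMLElement, not a string.
        $urls[] = trim((string) $entry->loc);
    }
    return $urls;
}

// Hypothetical helper: request one URL, discard the body, return the status code.
function warm_url(string $url, int $timeoutSeconds = 10): int
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // do not echo the body
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);  // follow 3xx redirects (assumed to be wanted)
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds);
    curl_exec($ch);
    return (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
}
```

With $data holding the sitemap body fetched as in the question, looping over sitemap_urls($data) and calling warm_url() on each entry should be enough to touch every page.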

There is also file_get_contents which can make it quick:

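foreach ($xml->url as $url_list) {
    // Cast to string: $url_list->loc is a SimpleXMLElement, not a plain string.
    $url = (string) $url_list->loc;
    // The response body is discarded; the request alone warms the cache.
    file_get_contents($url);
}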

Useful for application cache warm-up!
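
If you still want to use fsockopen, the advice quoted in the question (pass only the hostname to fsockopen and put the path in the request line) could be sketched roughly like this. build_get_request() is a hypothetical helper name, and the Connection: close header is my own addition so that the server closes the socket and the feof() loop can finish.

```php
<?php
// Hypothetical helper: split a full URL into the pieces that fsockopen()
// and the HTTP request need. Returns [host, port, raw request string].
function build_get_request(string $url): array
{
    $parts = parse_url($url);
    $host  = $parts['host'];          // hostname only, no scheme and no path
    $port  = $parts['port'] ?? 80;    // default HTTP port when none is given
    $path  = $parts['path'] ?? '/';
    if (isset($parts['query'])) {
        $path .= '?' . $parts['query'];
    }

    // The path goes on the request line; the Host header carries the hostname.
    $request = "GET $path HTTP/1.1\r\n"
             . "Host: $host\r\n"
             . "Connection: close\r\n\r\n";

    return [$host, $port, $request];
}

// Usage (needs network access, so it is left commented out):
// [$host, $port, $request] = build_get_request($url);
// $fp = fsockopen($host, $port, $errno, $errstr, 10);
// if ($fp) {
//     fwrite($fp, $request);
//     while (!feof($fp)) { fread($fp, 256); }
//     fclose($fp);
// }
```

Note that this does not follow redirects or handle HTTPS, which is part of why curl is the more convenient tool here.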
