
Using fopen, fwrite multiple times in a foreach loop

I want to save files from an external server into a folder on my server using fopen, fwrite.

First the page from the external site is loaded and scanned for image links. Then each link in the resulting array is passed to a function that copies the file with fopen and fwrite. The files are created, but they aren't valid JPG files; viewing them in the browser, it looks like their path on my server has been written into them.

Here is the code:

//read the file
$data = file_get_contents("http://foo.html");

//scan content for jpg links
preg_match_all('/src=("[^"]*.jpg)/i', $data, $result); 

//save img function
function save_image($inPath,$outPath)
{
    $in=    fopen($inPath, "rb");
    $out=   fopen($outPath, "wb");
    while ($chunk = fread($in,8192))
    {
        fwrite($out, $chunk, 8192);
    }
    fclose($in);
    fclose($out);
}

//output each img link from array
foreach ($result[1] as $imgurl) {
    echo "$imgurl<br />\n";
    $imgn = (basename ($imgurl));
    echo "$imgn<br />\n";
    save_image($imgurl, $imgn);
}

The save_image function works if I write out a list:

save_image('http://foo.html', 'foo1.jpg');
save_image('http://foo.html', 'foo1.jpg');

I was hoping that I'd be able to just loop the list from the matches in the array.

Thanks for looking.

There are two problems with your script. First, the opening quote mark sits inside the capture group, so it ends up as part of the image URL. To fix this, move it outside the group:

/src="([^"]*\.jpg)/i
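To see the difference, here is a minimal sketch that runs both patterns against a made-up HTML string. The sample markup and the variable names $html, $before and $after are assumptions for illustration only. The corrected pattern here also escapes the dot before jpg so that it only matches a literal dot.

```php
<?php
// Hypothetical sample markup, used only for illustration.
$html = '<img src="images/a.jpg"><img SRC="http://example.com/b.JPG">';

// Pattern from the question: the opening quote is inside the capture group.
preg_match_all('/src=("[^"]*.jpg)/i', $html, $before);

// Corrected pattern: the quote is matched outside the group.
preg_match_all('/src="([^"]*\.jpg)/i', $html, $after);

print_r($before[1]); // every entry starts with a stray "
print_r($after[1]);  // clean URLs without the quote
```

With the stray quote in place, fopen() is handed something like "images/a.jpg (quote included), which is not a usable URL or path.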

Secondly, the image URLs are probably not absolute (they don't include http:// and the full path). Put this at the start of your foreach loop to build the full URL:

$url = 'http://foo.html';
# If the image is absolute.
if(substr($imgurl, 0, 7) == 'http://' || substr($imgurl, 0, 8) == 'https://')
{
  $url = '';
}
# If the image URL starts with /, it goes from the website's root.
elseif(substr($imgurl, 0, 1) == '/')
{
  # Repeat until only http:// and the domain remain.
  while(substr_count($url, '/') != 2)
  {
    $url = dirname($url);
  }
}
# If only http:// and a domain without a trailing slash.
elseif(substr_count($url, '/') == 2)
{
  $url .= '/';
}
# If the web page has an extension, find the directory name.
elseif(strrpos($url, '.') > strrpos($url, '/'))
{
  $url = dirname($url) . '/';
}
$imgurl = $url. $imgurl;
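The same idea can be sketched with parse_url() instead of counting slashes. This is an alternative under stated assumptions, not the code above: resolve_url() is a hypothetical helper name, and it only handles absolute, root-relative and plain document-relative URLs. It does not handle ../ segments or protocol-relative URLs starting with //.

```php
<?php
// Hypothetical helper: resolve an image URL against the URL of the page it came from.
function resolve_url(string $pageUrl, string $imgUrl): string
{
    // Already absolute: use it as is.
    if (preg_match('#^https?://#i', $imgUrl)) {
        return $imgUrl;
    }

    $parts = parse_url($pageUrl);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return $imgUrl; // page URL cannot be parsed; leave the image URL untouched
    }

    // Scheme, host and optional port form the site root.
    $root = $parts['scheme'] . '://' . $parts['host'];
    if (isset($parts['port'])) {
        $root .= ':' . $parts['port'];
    }

    // Root-relative: append directly to the site root.
    if (substr($imgUrl, 0, 1) === '/') {
        return $root . $imgUrl;
    }

    // Document-relative: keep the page path up to and including its last slash.
    $path = $parts['path'] ?? '/';
    $dir = substr($path, 0, strrpos($path, '/') + 1);

    return $root . $dir . $imgUrl;
}

echo resolve_url('http://example.com/gallery/index.html', 'img/a.jpg'), "\n";
echo resolve_url('http://example.com/gallery/index.html', '/img/a.jpg'), "\n";
```

Inside the foreach loop this would be called as $imgurl = resolve_url($pageUrl, $imgurl); where $pageUrl is whatever URL was passed to file_get_contents().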

fopen isn't guaranteed to succeed. You should check the return value of any function that can return something different on error.

fopen() - Returns a file pointer resource on success, or FALSE on error.

In fact, most of the file functions, including fread() and fwrite(), return FALSE on error.

To figure out where it is failing, I would recommend using a debugger or printing some information from inside the save_image function, i.e. what $inPath and $outPath are, so you can verify that the values being passed are what you expect.
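As a rough sketch of both suggestions, save_image could check each return value and log the paths it was given. The bool return value, the use of error_log() and the @ operator to silence PHP's own warnings are choices made for this example, not part of the original function.

```php
<?php
// Sketch: save_image with return values checked; returns true on success.
function save_image(string $inPath, string $outPath): bool
{
    // @ suppresses PHP's warning so the failure is reported once, below.
    $in = @fopen($inPath, 'rb');
    if ($in === false) {
        error_log("save_image: cannot open source '$inPath'");
        return false;
    }

    $out = @fopen($outPath, 'wb');
    if ($out === false) {
        error_log("save_image: cannot open target '$outPath'");
        fclose($in);
        return false;
    }

    $ok = true;
    // Loop on feof() so that a chunk equal to "0" does not end the copy early.
    while (!feof($in)) {
        $chunk = fread($in, 8192);
        if ($chunk === false || fwrite($out, $chunk) === false) {
            error_log("save_image: copy failed from '$inPath' to '$outPath'");
            $ok = false;
            break;
        }
    }

    fclose($in);
    fclose($out);

    return $ok;
}
```

In the loop, a failed download can then be reported per image, for example with if (!save_image($imgurl, $imgn)) { echo "failed: $imgurl<br />\n"; }, which makes a stray quote or a relative path in $imgurl visible immediately.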

The main issue I see is that the regex may not capture a full http:// URL. Most sites leave the scheme and domain off and use relative paths. You should check for that and prepend the page's base URL when it is missing.

Your full match includes the src=" part, and your capture group includes the opening quote, so try a lookbehind instead:

preg_match_all('/(?<=src=")[^"]*\.jpg/i', $data, $result);

And then I think this should work. Since the pattern has no capture group, the matches are in $result[0]:

//output each img link from array
foreach ($result[0] as $imgurl) {
    echo "$imgurl<br />\n";
    $imgn = (basename ($imgurl));
    echo "$imgn<br />\n";
    save_image($imgurl, $imgn);
}
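To check what preg_match_all() returns for a lookbehind pattern, here is a small sketch run against made-up markup. The sample string is an assumption for illustration, and the dot before jpg is escaped here.

```php
<?php
// Hypothetical sample markup, used only for illustration.
$data = '<img src="images/a.jpg"> <img src="/b.jpg">';

// The lookbehind requires src=" before the match without consuming it.
preg_match_all('/(?<=src=")[^"]*\.jpg/i', $data, $result);

// No capture groups, so $result has a single entry: $result[0].
foreach ($result[0] as $imgurl) {
    echo $imgurl, ' -> ', basename($imgurl), "\n";
}
```

Because the only entry is $result[0], removing it or iterating over $result itself would leave no URL strings to pass to save_image.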

