
Downloading File from a URL using PHP script

Hi, I want to download about 250 files from a URL; the folders they sit in are numbered in sequence. I am almost done. The only problem is the structure of my URL: http://lee.kias.re.kr/~newton/sann/out/201409/<id>/SEQUENCE1.prsa

Here <id> runs in sequence, but the file name "SEQUENCE1.prsa" follows the pattern "SEQUENCE?.prsa". Is there any way to specify this file name pattern in my code? There are other files in each folder, but only one with the ".prsa" extension.

Code:
<?php
// Source URL pattern
//$sourceURLOriginal = "http://www.somewebsite.com/document{x}.pdf";
$sourceURLOriginal = " http://lee.kias.re.kr/~newton/sann/out/201409/{x}/SEQUENCE?.prsa";
// Destination folder
$destinationFolder = "C:\\Users\\hp\\Downloads\\SOP\\ppi\\RSAdata";
// Destination file name pattern
$destinationFileNameOriginal = "doc{x}.txt";
// Start number
$start = 7043;
// End number
$end = 7045;
$n = 1;
// From start to end
for ($i = $start; $i <= $end; $i++) {
    // Replace source URL parameter with number
    $sourceURL = str_replace("{x}", $i, $sourceURLOriginal);

    // Destination file name
    $destinationFile = $destinationFolder . "\\" .
        str_replace("{x}", $i, $destinationFileNameOriginal);

    // Read from URL, write to file
    file_put_contents($destinationFile,
        file_get_contents($sourceURL)
    );

    // Output progress
    echo "File #$i complete\n";
}
?>

It works if I specify the full URL directly!

Error:
Warning: file_get_contents( http://lee.kias.re.kr/~newton/sann/out/201409/7043/SEQUENCE?.prsa ): failed to open stream: Invalid argument in C:\xampp\htdocs\SOP\download.php on line 37
File #7043 complete

It creates the files, but they are empty!

Downloading the whole folder (each one is named with an id in the sequence) would also work. But how do I download an entire folder within a folder?

It is possible that file_get_contents() cannot open URLs on your server, for example when allow_url_fopen is disabled. Try this cURL-based code instead:

    function url_get_contents ($Url) {
        if (!function_exists('curl_init')){ 
            die('CURL is not installed!');
        }
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL, $Url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        $output = curl_exec($ch);
        curl_close($ch);
        return $output;
    }
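The function above returns whatever curl_exec() gives back, including false on failure, which would again produce empty files. A minimal sketch of a stricter variant follows, assuming the cURL extension is available; the name fetch_url and the timeout values are my own choices, not part of the original answer.

```php
<?php
// Hypothetical helper (not from the original answer):
// returns the response body, or null if the download failed.
function fetch_url(string $url, int $timeoutSeconds = 30): ?string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,  // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,  // follow HTTP redirects
        CURLOPT_FAILONERROR    => true,  // treat HTTP status codes >= 400 as failures
        CURLOPT_CONNECTTIMEOUT => 10,    // seconds allowed for connecting (assumed value)
        CURLOPT_TIMEOUT        => $timeoutSeconds,
    ]);

    $body = curl_exec($ch);
    if ($body === false) {
        // Log the reason instead of silently writing an empty file later.
        error_log('Download failed for ' . $url . ': ' . curl_error($ch));
        return null;
    }
    return $body;
}
```

With this, the loop can call file_put_contents() only when fetch_url() returns something other than null, so a failed request no longer leaves an empty file behind.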

Here you go.
I didn't test the file_get_contents / file_put_contents part as a whole, but since you say it creates the files (albeit empty ones), I assume it still works here.

Everything else works fine. I left a var_dump() in so you can see what the return value looks like.

I did what I suggested in my comment: open the folder, parse the file list, and grab the filename you need.
Also, I don't know if you read my original comments, but $sourceURLOriginal has an extra space at the beginning, which might have been causing the problem.
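The two URL problems can be looked at in isolation. The sketch below assumes the pattern exactly as posted in the question; it only builds and inspects the URL and does not contact the server. It illustrates that trim() removes the stray space, and that a "?" in a URL starts the query string rather than acting as a wildcard.

```php
<?php
// The pattern from the question, including its leading space.
$pattern = " http://lee.kias.re.kr/~newton/sann/out/201409/{x}/SEQUENCE?.prsa";

// trim() strips the stray whitespace before the placeholder is filled in.
$url = str_replace("{x}", "7043", trim($pattern));

// parse_url() shows how the "?" is interpreted: everything after it is the query string.
$parts = parse_url($url);

echo $parts['path'], "\n";   // /~newton/sann/out/201409/7043/SEQUENCE
echo $parts['query'], "\n";  // .prsa
```

So even with the space removed, the request would ask for a file named SEQUENCE, which is why the file name has to be looked up in the folder listing first.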

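<?php

$start = 7043;
$end = 7045;

$sourceURLOriginal = "http://lee.kias.re.kr/~newton/sann/out/201409/";
$destinationFolder = 'C:\Users\hp\Downloads\SOP\ppi\RSAdata';

for ($i = $start; $i <= $end; $i++) {
    // Fetch the folder listing for this id; skip the id if the request fails
    $contents = file_get_contents($sourceURLOriginal . $i);
    if ($contents === false) continue;

    // Collect every href in the listing
    preg_match_all("|href=[\"'](.*?)[\"']|", $contents, $hrefs);
    if (empty($hrefs[1])) continue;

    // Drop the first five links, which are not data files
    // (presumably the listing's sort and parent-directory links)
    unset($hrefs[1][0], $hrefs[1][1], $hrefs[1][2], $hrefs[1][3], $hrefs[1][4]);
    $file_list = array_values($hrefs[1]);
    var_dump($file_list);

    // Find the file whose name contains "prsa"; reset the index on every pass
    // so a match from the previous folder is never reused
    $needed_file = null;
    foreach ($file_list as $index => $file) {
        if (strpos($file, 'prsa') !== false) {
            $needed_file = $index;
            break;
        }
    }
    // Skip this id if the folder has no matching file
    if ($needed_file === null) continue;

    file_put_contents($destinationFolder . '\doc' . $i . '.txt',
        file_get_contents($sourceURLOriginal . $i . '/' . $file_list[$needed_file])
    );

}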
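The script above drops a fixed number of links, which depends on how the server lays out its listing page. As a sketch of an alternative, the link can be picked by its extension instead of its position. The function name find_listing_file and the sample listing below are made up for illustration; the real server's markup may differ.

```php
<?php
// Hypothetical helper: returns the first href in $html whose path ends
// in $extension, or null if the listing contains no such link.
function find_listing_file(string $html, string $extension): ?string
{
    // Collect every quoted href value in the page.
    if (!preg_match_all('/href\s*=\s*["\']([^"\']*)["\']/i', $html, $matches)) {
        return null;
    }
    foreach ($matches[1] as $href) {
        // Compare only the path part, so links that consist of a query
        // string (such as "?C=N;O=D") can never match.
        $path = (string) parse_url($href, PHP_URL_PATH);
        if (substr($path, -strlen($extension)) === $extension) {
            return $href;
        }
    }
    return null;
}

// Made-up listing fragment, used only to exercise the function.
$sample = '<a href="?C=N;O=D">Name</a> '
        . '<a href="/~newton/sann/out/201409/">Parent Directory</a> '
        . '<a href="SEQUENCE1.a3">SEQUENCE1.a3</a> '
        . '<a href="SEQUENCE1.prsa">SEQUENCE1.prsa</a>';

var_dump(find_listing_file($sample, '.prsa'));
```

Inside the loop, the return value would take the place of $file_list[$needed_file], with a null result meaning the folder should be skipped.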
