
Shell script in bash to download a file from an FTP server

I have to write a shell script for the bash shell to transfer a file from an FTP server, given:

ftp server -- fileserver@example.com
user: user1
password: pass1

Now, in /dir1/dir2 on the FTP server I have folders of the following form:
0.7.1.70
0.7.1.71
0.7.1.72

I have to copy the file "file1.iso" from the latest folder, i.e. 0.7.1.72 in this case. I also have to check the integrity of the file while copying: suppose the file is still being uploaded to the server when I start copying; in that case the copy would be incomplete.

I have to do this every 4 hours, which can be done by making it a cron job. Please help.
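For reference, a crontab entry that runs a script every four hours could look like the following (the script name is just a placeholder; the directory is the one used for the logs below):

```shell
# Hypothetical crontab line: run at minute 0 of every 4th hour.
# Install it with `crontab -e`; redirect output so failures are logged.
cron_line='0 */4 * * * /tempdir/pvmscript/fetchscript.sh >> /tempdir/pvmscript/scriptlog.log 2>&1'
echo "$cron_line"
```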

Here is what I have done: I mounted the FTP server folder on my local machine. To check whether the file has been completely uploaded, I check its size every 50 seconds, 5 times; if it stays the same I copy it, otherwise the script runs again after 4 hours. I maintain a text file "foldernames.txt" which holds the names of all the folders from which I have already copied the required file, so I can tell whether a new folder has been added on the server by looking its name up in foldernames.txt.

Everything is working fine; the only problem now is: suppose the file was being downloaded and at that time there was some network failure. How will I make sure that I have completely downloaded the file? I tried to use md5sum and cksum, but they were taking too long to compute on the mounted folder. Please help.

Here is my script:

#!/bin/bash
#
# Fetch debug.iso from the newest build folders on the mounted FTP share.
log=/tempdir/pvmscript/scriptlog.log
listfile=/tempdir/pvmscript/foldernames.txt
src=/var/mountpt/pvm-vmware
dest=/tempdir/PVM_Builds

echo " ########### " >> "$log"
date >> "$log"
echo " script is starting " >> "$log"
cd "$src" || exit 1
#
# Array holding the names of the last five folders at the source location.
# sort -V sorts version numbers correctly (e.g. 0.7.1.100 after 0.7.1.99).
declare -a arr
i=0
for folder in $(ls -1 | sort -V | tail -5); do
    arr[i]=$folder
    i=$((i+1))
done
echo " array initialised " >> "$log"
#
# For each of these 5 folders (newest first), check whether its name is
# already present in the list of copied folder names.
#
echo " checking for the folder name in list " >> "$log"
for j in $(seq $((i-1)) -1 0); do
    var3=${arr[$j]}
    echo " ----------------------------------------" >> "$log"
    echo " the folder name is $var3" >> "$log"
    #
    # Check whether the folder name is present in the stored list.
    #
    if grep -qx "$var3" "$listfile"; then
        echo " the folder $var3 is present in the list so will not copy it " >> "$log"
        continue
    fi

    echo " folder $var3 is not present in the list so checking if it has the debug.iso file " >> "$log"
    # Enter the new folder at the source.
    cd "$src/$var3" || continue

    if [ ! -f debug.iso ]; then
        echo " it does not have the debug.iso file so leaving the directory " >> "$log"
        echo "$var3" >> "$listfile"
        continue
    fi
    #
    # The file is present; check whether the upload is complete by sampling
    # the size 5 times at a regular interval of 50 seconds.
    #
    echo " it has the debug.iso, checking if upload is complete " >> "$log"
    check1="true"
    var7=$(du -s debug.iso | cut -f1)
    sleep 50s
    for x in 1 2 3 4 5; do
        var8=$(du -s debug.iso | cut -f1)
        if [ "$var7" -ne "$var8" ]; then
            # Size is still changing: exit and check again in 4 hours when
            # cron reruns the script.
            check1="false"
            echo " file is still in the process of being uploaded so exiting, will check after 4 hr " >> "$log"
            break
        fi
        sleep 50s
    done
    #
    # If the size was constant, copy the file to the destination.
    #
    if [ "$check1" = "true" ]; then
        echo " upload was complete so copying the debug.iso file " >> "$log"
        cp debug.iso "$dest/"
        echo " writing the folder name to the list of folders which we have copied " >> "$log"
        echo "$var3" >> "$listfile"
        echo " copying is complete " >> "$log"
    fi
done

Some comments and requests for clarification here; see below the break for one possible answer.

(Nice job updating your question.)

How big are these files?

Are these files whose creation start-time you have any control over (database backups, for example)?

It would also help to have a few more details about these files, i.e. their size (MB, GB, TB, PB?) and the source that creates them (db-backup, or ...?).

Are your concerns theoretical, proactive explorations of worst-case scenarios, or do you have real problems? If so, how often do they occur, and what are the consequences?

Is your SLA an unrealistic/unattainable management pipe dream? If so, then you have to start creating documentation to show that the current system will require X amount of additional resources (people, hardware, programming, etc.) to correct the deficiencies in your system.


If the files being transferred are data files created by a source system, one technique is to have the source system create a 'flag' file that is sent after the main file is sent.

It could contain details like

  filename  : TradeData_2012-04-13.dat
  recCount  : 777777
  fileSize  : 37604730291
  workOfDate: 2012-04-12
  md5sum    : ....

So now your system waits to find that the flag file has been delivered. Because you're using a standard naming convention for each file that you receive, with a standard date-stamp embedded in the file name, when the file arrives your script calculates each relevant detail and compares it to the value stored in the flag file.
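A minimal sketch of that comparison, assuming the flag-file field names shown above (`fileSize`, `md5sum`) and GNU `stat`/`md5sum`:

```shell
# verify_against_flag DATAFILE FLAGFILE
# Returns 0 only if the file's size and md5sum match the flag file's values.
verify_against_flag() {
    local data=$1 flag=$2
    local want_size want_md5 got_size got_md5
    # Pull the expected values out of the flag file, stripping padding spaces.
    want_size=$(awk -F':' '/^fileSize/ {gsub(/ /,"",$2); print $2}' "$flag")
    want_md5=$(awk -F':' '/^md5sum/ {gsub(/ /,"",$2); print $2}' "$flag")
    # Compute the same details for the file we actually received.
    got_size=$(stat -c %s "$data")
    got_md5=$(md5sum "$data" | cut -d' ' -f1)
    [ "$got_size" = "$want_size" ] && [ "$got_md5" = "$want_md5" ]
}
```

Run it locally against the downloaded copy (not the mounted folder), so the checksum is computed at local-disk speed rather than over the network.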

If you can't arrange this level of detail, at least a generic flag file, per day, per file, OR per daily batch of files (sent when all files are done), could be followed by tests that compare the new files against a set of checks that makes sense for your particular situation ... some of the following:

  • file must be at least X big
  • file must be at least N records
  • file can never be smaller than yesterday's file
  • etc

Then your defense is "we don't have complete control over the files, but we checked them for X, Y, Z and they passed those tests; that is why we loaded them".
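Such checks could be sketched roughly like this; the thresholds and file names are illustrative, not prescriptive:

```shell
# sanity_check FILE MIN_BYTES [YESTERDAYS_FILE]
# Returns 0 only if every applicable test passes (GNU stat assumed).
sanity_check() {
    local file=$1 min_bytes=$2 prev=$3
    local size
    [ -f "$file" ] || return 1
    size=$(stat -c %s "$file")
    # file must be at least X big
    [ "$size" -ge "$min_bytes" ] || return 1
    # file can never be smaller than yesterday's file
    if [ -n "$prev" ] && [ -f "$prev" ]; then
        [ "$size" -ge "$(stat -c %s "$prev")" ] || return 1
    fi
    return 0
}
```

A "must have at least N records" test would be one more line of the same shape, e.g. comparing `wc -l` output against a minimum.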


While rsync could be good, I don't see how, given some of the scenarios mentioned, you'd ever be sure that it was safe to start loading the file, as rsync might start adding more data to the file.


Reading through your script: if you can't get a detailed flag file from your source, you're on the right track. Glenn Jackman's solution looks to accomplish the same goal with less code. You could put that inside a script file, 'getRemotedata.sh' or similar, and call it from a while loop that only exits when 'getRemotedata.sh' exits with success. I guess I would also want some type of notification when it has spent 3x the normal time running. But it can get very complex when you try to cover all conditions. There are 3rd-party tools that can manage file downloads, but we never had the budget to buy them, so I can't recommend any.
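The retry loop suggested above could be sketched like this; the script name `getRemotedata.sh`, the attempt limit, and the pause length are all assumptions to adapt:

```shell
# retry CMD [ARGS...]: rerun CMD until it succeeds, giving up after 5 tries.
retry() {
    local attempts=0
    until "$@"; do
        attempts=$((attempts+1))
        if [ "$attempts" -ge 5 ]; then
            echo "giving up after $attempts attempts" >&2
            return 1
        fi
        sleep 1   # use a longer pause, and add alerting, in production
    done
}

# e.g.  retry ./getRemotedata.sh
```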

whew

I hope this helps.


PS Welcome to StackOverflow (SO). Please remember to read the FAQs, http://tinyurl.com/2vycnvr , vote for good Q/A by using the gray triangles, http://i.imgur.com/kygEP.png , and accept the answer that best solves your problem, if any, by pressing the checkmark sign, http://i.imgur.com/uqJeW.png

The FTP protocol is not robust enough: it does not deal with atomicity, and there's no way to know whether a file is still being uploaded while you download it. If you need this functionality, you should investigate using rsync for both downloading AND uploading.

#!/bin/sh
# Lock directory: mkdir is atomic, so it fails if another instance already
# holds the lock. Note the `!` -- we bail out when mkdir FAILS.
if ! mkdir /tmp/download_in_process 2>/dev/null; then
    echo "cannot start, download in process"
    exit 1
fi

# Ask the server for the directory listing and keep the last (latest) entry.
# -n disables auto-login so the `user` command in the here-doc takes effect.
latest=$(ftp -n hostname << END1 | tail -1
user user1 pass1
cd /dir1/dir2
ls
END1
)

ftp -n hostname << END2
user user1 pass1
cd /dir1/dir2/$latest
get file1.iso
END2

rmdir /tmp/download_in_process
