
Python: copy only missing files from FTP dirs and sub-dirs to local dirs and sub-dirs

The problem is:

I have a local directory '/local' and a remote FTP directory '/remote' full of subdirectories and files. I want to check whether there are any new files in the sub-directories of '/remote'. If there are, I want to copy them over to '/local'.

The question is:

Am I using the right strategy? Is this total overkill, and is there a much faster, more Pythonic way to do it? DISCLAIMER: I'm a Python n00b trying to learn, so be gentle ... =) This is what I've tried:

Create a list of all files in /local and its sub-dirs.

import os

LocalFiles = []
for path, subdirs, files in os.walk(localdir):
    for name in files:
        # Only the bare filename is stored, so the sub-directory is lost here.
        LocalFiles.append(name)
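
One thing worth noting about the listing above: because only bare filenames are kept, two files with the same name in different sub-directories look identical. A minimal sketch of an alternative, assuming it is acceptable to key files by their path relative to the root (the function name `list_local_files` is made up for illustration):

```python
import os

def list_local_files(root):
    """Return the set of file paths under root, relative to root, using '/' separators."""
    found = set()
    for path, subdirs, files in os.walk(root):
        for name in files:
            # Keep the sub-directory so dir1/file1.txt and dir2/file1.txt stay distinct.
            rel = os.path.relpath(os.path.join(path, name), root)
            found.add(rel.replace(os.sep, "/"))
    return found
```

Using '/' as the separator keeps the local paths directly comparable with FTP paths, which also use '/'.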

Do some ftplib magic, using ftpwalk() and copying its results to a list of the form:

RemoteFiles = [['/remote/dir1/', '/remote/dir1/', '/remote/dir3/'], ['file1.txt', 'file12.py', 'file3.zip']]

so I have the directory corresponding to each file. Then see which files are missing by comparing the lists of filenames,

missing_files= list(set(RemoteFiles[1]) - set(LocalFiles))  

and once I've found their name, I try to find the directory that came with that name,

for name in missing_files:
    # list.index returns only the first match, so duplicate filenames map to one directory.
    theindex = RemoteFiles[1].index(name)

which lets me build the list of missing files and their directories,

MissingDirNFiles.append([RemoteFiles[0][theindex], RemoteFiles[1][theindex]])

so I can copy them over with ftp.retrbinary. Is this a reasonable strategy? Any tips, comments and advice are appreciated [especially for large numbers of files].
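
For what it's worth, the compare-then-copy steps described above can be condensed if both listings are expressed as '/'-separated paths relative to their roots, so a single set difference gives the missing files together with their directories. This is only a sketch under that assumption; `find_missing` and `download_missing` are hypothetical helper names, and `ftp` is expected to be a connected, logged-in `ftplib.FTP` object:

```python
import os
import posixpath

def find_missing(remote_paths, local_paths):
    """Both arguments are iterables of '/'-separated paths relative to their roots."""
    return sorted(set(remote_paths) - set(local_paths))

def download_missing(ftp, remote_root, local_root, missing):
    """Fetch each missing relative path with ftp.retrbinary, creating local sub-dirs."""
    for rel in missing:
        target = os.path.join(local_root, *rel.split("/"))
        # Make sure the local sub-directory exists before writing into it.
        os.makedirs(os.path.dirname(target), exist_ok=True)
        with open(target, "wb") as fh:
            ftp.retrbinary("RETR " + posixpath.join(remote_root, rel), fh.write)
```

A call might look like `download_missing(ftp, '/remote', '/local', find_missing(remote_paths, local_paths))`. The set difference stays fast for large numbers of files, and no index lookup is needed to recover the directory.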

If you get the modification times of both the local and the remote FTP directories and store them in a database, you can prune the search for new or modified files: anything whose modification time has not changed since the last sync can be skipped. This should speed up the sync procedure significantly.
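
A rough sketch of that idea, using `sqlite3` as the database. It assumes the server supports the `MDTM` command (not all do) and tracks individual files rather than directories; the table name `seen` and the helpers `open_cache`, `parse_mdtm`, `is_changed` and `record` are invented for illustration:

```python
import sqlite3

def open_cache(db_path=":memory:"):
    """Open (or create) the cache of last-seen remote modification times."""
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS seen (path TEXT PRIMARY KEY, mtime TEXT)")
    return conn

def parse_mdtm(reply):
    """Extract 'YYYYMMDDHHMMSS' from an MDTM reply such as '213 20240101120000'."""
    return reply.split()[1][:14]

def is_changed(conn, path, mtime):
    """Return True if path is unknown or its stored mtime differs from mtime."""
    row = conn.execute("SELECT mtime FROM seen WHERE path = ?", (path,)).fetchone()
    return row is None or row[0] != mtime

def record(conn, path, mtime):
    """Remember mtime for path; call this only after a successful download."""
    conn.execute("INSERT OR REPLACE INTO seen (path, mtime) VALUES (?, ?)", (path, mtime))
    conn.commit()
```

With a connected `ftplib.FTP` object, the timestamp could be obtained with `parse_mdtm(ftp.sendcmd('MDTM ' + remote_path))`, and a file would be downloaded only when `is_changed` returns True.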

