
Python ftplib: bz2 files uploaded to an FTP server sometimes become corrupted

I am using the ftplib module to upload many bz2 files to an FTP server (about 1000 files per day, each about 5 MB and storing a NumPy array). Once in a while, some of the uploaded files are corrupted. When I try to read such a file using bz2 and numpy, I get the error "IOError: invalid data stream". If I try to decompress it with software such as WinRAR, I get the message "Checksum error in filename. The file is corrupted."

The code for uploading the data is nothing special. Basically it looks like this:

while True:
    try:
        fidFile = open(fileName, 'rb')
        ftp.storbinary('STOR '+fileName, fidFile)
        fidFile.close()
        break
    except:
        continue

For the corrupted files, if I upload them again using the same code, most of the time I get good files.

Using a different FTP server does not eliminate this problem.

I also noticed that the corrupted file has exactly the same size in bytes as the good file. I guess all the necessary data has been uploaded, so I really do not understand why the file is corrupted.
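Because an equal byte count does not show that the contents match, one diagnostic step might be to check each local file before it is uploaded and record a hash of it. The sketch below assumes the files are ordinary bz2 streams on local disk. The helper names `is_valid_bz2` and `sha256_of` are hypothetical and do not come from the original post.

```python
import bz2
import hashlib


def is_valid_bz2(path, chunk_size=1024 * 1024):
    """Return True if the whole file decompresses without error."""
    try:
        with bz2.open(path, "rb") as f:
            # Read to the end so every block checksum gets verified.
            while f.read(chunk_size):
                pass
        return True
    except (OSError, EOFError):
        # OSError: invalid data stream (or unreadable file);
        # EOFError: the stream ends before the end-of-stream marker.
        return False


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

If a local file already fails `is_valid_bz2` at upload time (for example because it is still being written), the problem lies before the FTP transfer. If it passes, comparing `sha256_of` for the local file and a downloaded copy of the uploaded file shows whether the bytes changed in transit.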

A possible workaround for this problem:

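import ftplib

def upload(fileName, retries=3):
    # Retry a bounded number of times instead of recursing without limit.
    for attempt in range(retries):
        try:
            # The with block closes the file even if the upload raises.
            with open(fileName, 'rb') as fidFile:
                ftp.storbinary('STOR ' + fileName, fidFile)
            return True
        except ftplib.all_errors as e:
            print(e)
    return False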

It retries the upload whenever an exception is raised and stops once the upload succeeds.
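Retrying on exceptions alone may not catch the case described in the question, where the upload finishes without an error but the stored file is corrupted. A minimal sketch of a stricter variant follows: after each upload it downloads the file again and compares SHA-256 hashes, retrying on a mismatch. The function names, the `max_attempts` parameter, and the choice of SHA-256 are assumptions made for illustration. Only `storbinary`, `retrbinary`, and `ftplib.all_errors` are part of ftplib.

```python
import ftplib
import hashlib


def local_sha256(path, chunk_size=64 * 1024):
    """Hash a local file in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def remote_sha256(ftp, remote_name):
    """Download the remote file and hash it without storing it."""
    digest = hashlib.sha256()
    ftp.retrbinary("RETR " + remote_name, digest.update)
    return digest.hexdigest()


def upload_verified(ftp, path, remote_name, max_attempts=3):
    """Upload path, re-download it, and retry until the hashes match.

    Returns the number of attempts used; raises RuntimeError if no
    attempt produced an intact remote copy.
    """
    expected = local_sha256(path)
    for attempt in range(1, max_attempts + 1):
        try:
            with open(path, "rb") as f:
                ftp.storbinary("STOR " + remote_name, f)
            if remote_sha256(ftp, remote_name) == expected:
                return attempt
            print(f"attempt {attempt}: hash mismatch, retrying")
        except ftplib.all_errors as exc:
            print(f"attempt {attempt} failed: {exc}")
    raise RuntimeError(
        f"could not upload {path} intact after {max_attempts} attempts"
    )
```

Here `ftp` is expected to be a connected, logged-in `ftplib.FTP` object. The verification roughly doubles the transferred data, which may be acceptable for about 1000 files of 5 MB per day. It detects corruption but does not explain its cause.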
