
Python zipfile.ZipFile zips a corrupt file

I have a Django view that users can call to zip files on my local server. It uses zipfile.ZipFile to compress multiple files into a single zip as follows:

with ZipFile(my_dir + 'folder.zip', 'w') as zipObj:
    zipObj.write(my_dir + '1.json')
    zipObj.write(my_dir + '2.json')

Then I return this file to the user in the response:

folder_file = open(full_path, "r", encoding='Cp437')
response = HttpResponse(FileWrapper(folder_file), content_type='application/zip')

But the downloaded file is corrupt; I can't open it with the Ubuntu archive manager.

When I try to unzip the file on my Django server using Python and the same package, I still get an error:

with ZipFile(file_path, 'r') as zip_ref:
    zip_ref.extractall(my_dir)

The error I get is:

  File ".../views.py", line 38, in post
    with ZipFile(file_path, 'r') as zip_ref:
  File "/usr/lib/python3.8/zipfile.py", line 1269, in __init__
    self._RealGetContents()
  File "/usr/lib/python3.8/zipfile.py", line 1354, in _RealGetContents
    fp.seek(self.start_dir, 0)
OSError: [Errno 22] Invalid argument

Any idea what I am doing wrong here?

This should be a comment, but I think it's too long.

I think you should have a look at your paths, because a wrong path can lead to unwanted behaviour. From the documentation for ZipFile.write:

Archive names should be relative to the archive root, that is, they should not start with a path separator.

and from the documentation for ZipFile.extract (and extractall):

If a member filename is an absolute path, a drive/UNC sharepoint and leading (back)slashes will be stripped, e.g.: ///foo/bar becomes foo/bar on Unix, and C:\foo\bar becomes foo\bar on Windows. And all ".." components in a member filename will be removed, e.g.: ../../foo../../ba..r becomes foo../ba..r. On Windows illegal characters (:, <, >, |, ", ?, and *) replaced by underscore (_).

So make sure you use correct paths, and make sure they do not contain problematic characters (such as wildcards or backslashes).
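The path advice above can be sketched as follows. This is a minimal example, assuming the files live in a temporary directory that stands in for my_dir; the file contents are placeholders. Passing arcname to write() stores only the relative name in the archive instead of the full path on disk.

```python
import os
import tempfile
from zipfile import ZipFile

# Hypothetical working directory standing in for my_dir.
my_dir = tempfile.mkdtemp()
for name in ("1.json", "2.json"):
    with open(os.path.join(my_dir, name), "w") as f:
        f.write("{}")  # placeholder content

zip_path = os.path.join(my_dir, "folder.zip")
with ZipFile(zip_path, "w") as zip_obj:
    for name in ("1.json", "2.json"):
        # arcname sets the name stored inside the archive, so the
        # directory part of the on-disk path is not recorded.
        zip_obj.write(os.path.join(my_dir, name), arcname=name)

# Reopen the archive and list the stored member names.
with ZipFile(zip_path) as zip_obj:
    names = zip_obj.namelist()
print(names)
```

Using os.path.join also avoids depending on my_dir ending with a path separator.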

Maybe you should also test with other zip/unzip tools to see if it makes a difference; sometimes their error messages are more specific.

You can try something like this:

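import io
from urllib.request import urlopen
from zipfile import ZipFile

from django.http import HttpResponse

# files_to_add is assumed to be a list of URLs to download
with ZipFile("whatever.zip", "w") as zipf:
    for file in files_to_add:
        this_file = urlopen(file).read()
        this_filename = 'file name.json'  # use a distinct name for each file
        zipf.writestr(this_filename, this_file)

# Read the finished archive back in binary mode
with io.open("whatever.zip", mode="rb") as f:
    response = HttpResponse(f.read(), content_type="application/zip")
response["Content-Disposition"] = "attachment; filename=whatever_name_you_want.zip"

return response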
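The same writestr idea can be tried without Django or network access. The sketch below is only an illustration: build_zip_bytes is a hypothetical helper, and the member names and contents are made up. It builds the archive in an in-memory buffer and then reopens it to check that it is readable.

```python
import io
from zipfile import ZipFile


def build_zip_bytes(files):
    """Return a zip archive as bytes; files maps archive names to contents."""
    buffer = io.BytesIO()
    with ZipFile(buffer, "w") as zipf:
        for name, data in files.items():
            zipf.writestr(name, data)
    # The archive is complete once the with block has closed the ZipFile.
    return buffer.getvalue()


# Hypothetical sample data standing in for the downloaded files.
payload = build_zip_bytes({"1.json": b'{"a": 1}', "2.json": b'{"b": 2}'})

# Reopen the bytes as an archive to confirm they form a valid zip.
with ZipFile(io.BytesIO(payload)) as zipf:
    names = zipf.namelist()
    first = zipf.read("1.json")
print(names, first)
```

In a view, the resulting bytes could be passed straight to HttpResponse(payload, content_type="application/zip"), which avoids reopening a file from disk at all.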

Open the zip file in binary mode when creating the HttpResponse. Text mode decodes the bytes and converts newlines, which corrupts the archive:

folder_file = open(full_path, "rb")
response = HttpResponse(FileWrapper(folder_file), content_type='application/zip')
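To see why text mode matters, here is a small standalone sketch with no Django involved. It assumes that a file opened with encoding='Cp437' yields str chunks and that the response re-encodes them as UTF-8; the member name and JSON content are invented for the demonstration. It only simulates the decode and re-encode step, not newline translation.

```python
import io
import zipfile

# Build a small archive in memory. The content includes a non-ASCII
# character so the stored bytes contain values above 0x7F.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("1.json", '{"name": "café"}'.encode("utf-8"))
original = buffer.getvalue()

# Simulate the text-mode round trip: decode the raw bytes as cp437,
# then encode the resulting text as UTF-8 (assumed response charset).
mangled = original.decode("cp437").encode("utf-8")

print(len(original), len(mangled))
print(mangled == original)
```

Every byte above 0x7F becomes a multi-byte UTF-8 sequence, so the payload grows and the offsets recorded inside the archive no longer match the data. Reading with "rb" passes the bytes through untouched.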
