Uploading large zip files to a website using Python
I have the following problem: I need to upload large .zip files (usually >500 MB, with a maximum of about 5 GB) to a website, which then processes these files. I do this in Python 2.7.16 on 32-bit Windows. Sadly, I cannot change my setup (from 32-bit to 64-bit), nor can I install additional Python packages (I have requests, urllib, urllib2 and several others installed) due to company restrictions. My code currently looks like this:
import requests
FileList=["C:\File01.zip", "C:\FileA02.zip", "C:\UserFile993.zip"]
UploadURL = "https://mywebsite.com/submitFile"
for FilePath in FileList:
    print("Upload file: "+str(FilePath))
    session = requests.Session()
    with open(FilePath, "rb") as file:
        session.post(UploadURL,data={'file':'Send file'},files={'FileToBeUploaded':FilePath})
    print("Upload done: "+str(FilePath))
    session.close()
Since my FileList is quite long (>100 entries), I have pasted only an excerpt of it here. The code above works well for files below 600 MB. Any file larger than that throws this error:
File "<stdin>", line 1, in <module>
File "C:\Users\AAA253\Desktop\DingDong.py", line 39, in <module>
session.post(UploadURL,data={'file':'Send file'},files={'FileToBeUploaded':FilePath})
File "C:\Python27\lib\site-packages\requests\sessions.py", line 522, in post
return self.request('POST', url, data=data, json=json, **kwargs)
File "C:\Python27\lib\site-packages\requests\sessions.py", line 461, in request
prep = self.prepare_request(req)
File "C:\Python27\lib\site-packages\requests\sessions.py", line 394, in prepare_request
hooks=merge_hooks(request.hooks, self.hooks),
File "C:\Python27\lib\site-packages\requests\models.py", line 297, in prepare
self.prepare_body(data, files, json)
File "C:\Python27\lib\site-packages\requests\models.py", line 455, in prepare_body
(body, content_type) = self._encode_files(files, data)
File "C:\Python27\lib\site-packages\requests\models.py", line 158, in _encode_files
body, content_type = encode_multipart_formdata(new_fields)
File "C:\Python27\lib\site-packages\requests\packages\urllib3\filepost.py", line 86, in encode_multipart_formdata
body.write(data)
MemoryError
I have already searched the forum here for solutions, but sadly I could not find a suitable one. Does anybody have an idea how to get this done? Could it be done by loading the file in chunks? If so, how do I upload the file in chunks so that the server does not "cancel" the operation?
Edit: using the answer from @AKX, I now use this code:
import requests
from requests_toolbelt.multipart import encoder
FileList=["C:\File01.zip", "C:\FileA02.zip", "C:\UserFile993.zip"]
UploadURL = "https://mywebsite.com/submitFile"
for FilePath in FileList:
    session = requests.Session()
    with open(FilePath, 'rb') as f:
        form = encoder.MultipartEncoder({"documents": (FilePath, f, "application/octet-stream"),"composite": "NONE",})
        headers = {"Prefer": "respond-async", "Content-Type": form.content_type}
        resp = session.post(UploadURL,data={'file':'Send file'},files={'FileToBeUploaded':form})
    session.close()
Nevertheless, I get nearly the same error:
File "<stdin>", line 1, in <module>
File "C:\Users\AAA253\Desktop\DingDong.py", line 48, in <module>
resp = session.post(UploadURL,data={'file':'Send file'},files={'FileToBeUploaded':form})
File "C:\Python27\lib\site-packages\requests-2.24.0-py2.7.egg\requests\sessions.py", line 578, in post
return self.request('POST', url, data=data, json=json, **kwargs)
File "C:\Python27\lib\site-packages\requests-2.24.0-py2.7.egg\requests\sessions.py", line 516, in request
prep = self.prepare_request(req)
File "C:\Python27\lib\site-packages\requests-2.24.0-py2.7.egg\requests\sessions.py", line 459, in prepare_request
hooks=merge_hooks(request.hooks, self.hooks),
File "C:\Python27\lib\site-packages\requests-2.24.0-py2.7.egg\requests\models.py", line 317, in prepare
self.prepare_body(data, files, json)
File "C:\Python27\lib\site-packages\requests-2.24.0-py2.7.egg\requests\models.py", line 505, in prepare_body
(body, content_type) = self._encode_files(files, data)
File "C:\Python27\lib\site-packages\requests-2.24.0-py2.7.egg\requests\models.py", line 159, in _encode_files
fdata = fp.read()
File "build\bdist.win32\egg\requests_toolbelt\multipart\encoder.py", line 314, in read
File "build\bdist.win32\egg\requests_toolbelt\multipart\encoder.py", line 194, in _load
File "build\bdist.win32\egg\requests_toolbelt\multipart\encoder.py", line 256, in _write
File "build\bdist.win32\egg\requests_toolbelt\multipart\encoder.py", line 552, in append
MemoryError
You will more likely than not need the requests-toolbelt streaming MultipartEncoder.
Even if your company restrictions forbid installing new packages, you can likely vendor the parts of requests_toolbelt you need (maybe the whole package) into your project's directory.
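If vendoring is not possible either, the chunked idea from the question can be sketched with the standard library alone. This is a different technique from the toolbelt encoder: a generator yields the multipart body piece by piece, and requests sends a generator passed as data= with chunked transfer encoding. It only works if the server accepts chunked requests, which is not guaranteed. The function names iter_multipart and upload_chunked are hypothetical, and the field names are again assumed from the question's first snippet.

```python
import os
import uuid


def iter_multipart(fields, file_field, file_path, boundary, chunk_size=1024 * 1024):
    """Yield a multipart/form-data body as a sequence of byte strings."""
    delimiter = b"--" + boundary.encode("ascii") + b"\r\n"
    # Plain text form fields come first.
    for name, value in fields:
        yield (delimiter
               + b'Content-Disposition: form-data; name="' + name.encode("utf-8") + b'"\r\n\r\n'
               + value.encode("utf-8") + b"\r\n")
    # Header of the file part.
    filename = os.path.basename(file_path).encode("utf-8")
    yield (delimiter
           + b'Content-Disposition: form-data; name="' + file_field.encode("utf-8")
           + b'"; filename="' + filename + b'"\r\n'
           + b"Content-Type: application/zip\r\n\r\n")
    # File content, with at most chunk_size bytes held in memory at a time.
    with open(file_path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield chunk
    # Closing delimiter.
    yield b"\r\n--" + boundary.encode("ascii") + b"--\r\n"


def upload_chunked(session, url, file_path):
    """Post file_path through an existing requests.Session without buffering it."""
    boundary = uuid.uuid4().hex
    # Field names are assumptions based on the question's code.
    body = iter_multipart([("file", "Send file")], "FileToBeUploaded", file_path, boundary)
    headers = {"Content-Type": "multipart/form-data; boundary=" + boundary}
    return session.post(url, data=body, headers=headers)
```

Because the total length is not known up front, no Content-Length header is sent. If the server rejects the request for that reason, the toolbelt encoder is the better fit, since it computes the length in advance.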