Files uploaded to S3 with S3BotoStorage end up with invalidly escaped content-type metadata
FACEPALM UPDATE: It turns out I had overlooked the fact that I was using an older fork of S3BotoStorage from https://github.com/gtaylor/django-athumb as my default storage (even though I had django-storages installed). The current version of django-storages doesn't suffer from this problem.
The problem was that the content-type headers were unicode when they hit boto, and boto escapes unicode using urllib.quote_plus before sending it on to AWS. This isn't really boto's fault, since HTTP requires headers to be converted to non-unicode strings somehow. For a more in-depth analysis see https://github.com/boto/boto/issues/1669 .
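The escaping described above can be reproduced without boto. What follows is a simplified model, not boto's actual code: the function name escape_headers_like_old_boto is made up for illustration, and it is written for Python 3, where str stands in for Python 2's unicode and urllib.parse.quote_plus is the counterpart of Python 2's urllib.quote_plus.

```python
from urllib.parse import quote_plus


def escape_headers_like_old_boto(headers):
    """Hypothetical model of the behaviour described above (not boto's code).

    Text (unicode) header values are percent-escaped; byte-string values
    are passed through unchanged.
    """
    escaped = {}
    for name, value in headers.items():
        if isinstance(value, str):
            # quote_plus has an empty `safe` set by default,
            # so "/" is escaped to "%2F".
            escaped[name] = quote_plus(value.encode("utf-8"))
        else:
            # Byte strings are assumed to be ASCII and are left as they are.
            escaped[name] = value.decode("ascii")
    return escaped


unicode_headers = {"Content-Type": "application/pdf"}
byte_headers = {"Content-Type": b"application/pdf"}

print(escape_headers_like_old_boto(unicode_headers))
print(escape_headers_like_old_boto(byte_headers))
```

In this model only the unicode value comes out mangled, which is consistent with the observation that a storage backend handing boto a plain byte string avoids the problem.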
Original Question
I am using django-storages' S3BotoStorage in conjunction with a FileField to upload files to Amazon S3. Here's my field:
downloadable_file = FileField(max_length=255, upload_to="widgets/filedownloads", verbose_name="file")
In settings:
DEFAULT_FILE_STORAGE = 'storages.backends.s3boto.S3BotoStorage'
Everything works as far as the uploading/downloading goes.
However, the files are getting stored in my bucket with an incorrect content-type. When I look at the metadata for the files in my AWS S3 console, the Content-Type of the file is showing up as "application%2Fpdf" instead of "application/pdf", which it should be.
In case you say it shouldn't matter, it does matter. Google Chrome's built-in PDF reader will hang on PDFs with an invalid content-type, and a client brought this to my attention.
Here's an example of a file uploaded through django-storages/boto. If you're using Chrome's built-in PDF reader I assume it hangs, like it does for me and the customer who reported this. If you're using a non-Chrome browser, or the Adobe plugin, or downloading the file to disk, you'll probably be fine.
If I manually change the content-type metadata via the AWS console to 'application/pdf' (one of the standard choices it provides) then it's fine.
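For objects that were already uploaded with a mangled value, the corrected content type can be computed before the metadata is fixed by hand or by script. This is a minimal sketch for Python 3; repair_content_type is a hypothetical helper, and it assumes a mangled value is one that contains an escaped slash but no literal one. Writing the corrected value back to S3 is deliberately left out.

```python
from urllib.parse import unquote_plus


def repair_content_type(value):
    """Undo percent-escaping on a content type that looks mangled.

    Values that already contain a literal "/" are returned untouched, so
    a valid type such as "image/svg+xml" does not have its "+" turned
    into a space by unquote_plus.
    """
    if "/" not in value and "%2f" in value.lower():
        return unquote_plus(value)
    return value


print(repair_content_type("application%2Fpdf"))
print(repair_content_type("image/svg+xml"))
```

The guard on the literal slash is the main design choice here: unquote_plus is only safe to apply to values that were actually escaped.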
I assume this is a bug in the way boto internally constructs the AWS policy document to upload the file, since I'm not doing anything outside of the standard usage here. However, I've stepped through the boto code and can't find where it actually does the escaping.
Can someone either suggest a workaround, or guide me to the offending code in boto so I can patch it and submit a pull request?
boto==2.9.5 django-storages==1.1.8
Not a direct answer to your question, but maybe a useful workaround. I was having issues using django-storages with S3. I ended up trying cuddly-buddly and have been quite happy with it. The author based it on the S3 module from django-storages and has added quite a number of fixes.
I browsed through the cuddly-buddly commits and there were some modifications affecting the content-type header, but I can't test with PDF uploads without setting up a new Django project. However, I can verify that none of my files uploaded through Django have mangled slashes in the content-type field in the S3 metadata.
If for some reason you can't change over to cuddly-buddly for testing, let me know and I'll try to set up a simple Django project to upload some PDFs.
The problem was that I was using a forked, obsolete version of django-storages which did not properly convert content-type headers from unicode to strings before sending them to boto. boto converts unicode strings to ASCII strings (as required for HTTP headers) by using urllib's quote_plus escape mechanism. The problem was fixed by switching to the current version of django-storages.
For a more detailed analysis of the issue see: https://github.com/boto/boto/issues/1669#issuecomment-27132112