简体繁体 English

Python：处理gzip压缩json的正确方法是什么？

[英]Python: what is the correct way to handle gzipped json?

原文 2011-10-10 19:42:05 5 3 python/ json/ gzip/ asp.net-web-api

I've found this snippet , which seems to do the job, but I can't understand why it uses StringIO. 我找到了这个片段，似乎可以完成这项工作，但我无法理解为什么它使用StringIO。 Isn't f already a file-like object? 是不是f已经是一个类似文件的对象？ What is the need to read it, then make it look like a file again, only to read it again? 有什么需要阅读它，然后让它看起来像一个文件，只是再次阅读？ I've tested it (well, a slightly modified version of it), and it doesn't work without StringIO. 我已经测试了它（好吧，它的稍微修改过的版本），如果没有StringIO它就无法运行。

3 个解决方案

Seems to be a flaw in python standard library which is fixed in Python 3.2. 似乎是python标准库中的一个缺陷，它在Python 3.2中得到修复。
see http://www.enricozini.org/2011/cazzeggio/python-gzip/ 见http://www.enricozini.org/2011/cazzeggio/python-gzip/

urllib and urllib2 file objects do not provide a method tell() as requested by gzip. urllib和urllib2文件对象不提供gzip请求的方法tell() 。

It's possible that the gunzip code needs a file-like object that has a seek method, which a HTTP library is very unlikely to provide. gunzip代码可能需要一个类似文件的对象，该对象具有一个HTTP库不太可能提供的seek方法。 What does "doesn't work" mean? 什么“不起作用”是什么意思？ Error message? 错误信息？

If efficiency is your real concern, slightly modify the code so that it uses cStringIO, not StringIO. 如果效率是您真正关心的问题，请稍微修改代码，使其使用cStringIO，而不是StringIO。

The way I read the relevant part of the code says: 我阅读代码相关部分的方式是：

Open an url 打开网址
Download it completely into memory (with the read method) 将其完全下载到内存中（使用read方法）
Store the content in a StringIO object, so that it's available as a file-like object 将内容存储在StringIO对象中，以便它可用作类文件对象
Do the gzip and json stuff with it. 用它做gzip和json的东西。