How to use csv.DictReader on a tarfile object in Python 3.6?
Here's the issue I'm running into:
Error: iterator should return strings, not bytes (did you open the file in text mode?)
The code that's causing this looks something like:
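import csv
import tarfile

with tarfile.open(filename) as t:
    for fileinfo in t:
        f = t.extractfile(fileinfo)  # binary file-like object
        reader = csv.DictReader(f)
        reader.fieldnames  # fails: iterator should return strings, not bytes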
The trouble seems to be that the extractfile() method produces an io.BufferedReader, which is a binary file-like object with no text interface, so csv.DictReader receives bytes where it expects str.
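A quick way to check that diagnosis is to read one line from the extracted member and look at its type. This is only a minimal sketch: the archive is built in memory so the snippet runs on its own, and make_tar_bytes and demo.csv are made-up names, not part of my real code.

```python
import io
import tarfile


def make_tar_bytes(name, payload):
    """Build a small in-memory tar archive holding one file (demo helper)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        info = tarfile.TarInfo(name=name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


archive = make_tar_bytes("demo.csv", b"id,name\n1,alice\n")

with tarfile.open(fileobj=io.BytesIO(archive)) as t:
    member = t.getmembers()[0]
    f = t.extractfile(member)
    is_buffered_reader = isinstance(f, io.BufferedReader)
    first_line = f.readline()

# The member yields bytes lines, which is what csv.DictReader rejects.
print(is_buffered_reader, type(first_line))
```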
What would be a good way to handle this?
I'm thinking of decoding the bytes from the reader into text, but I need to retain streaming because these files are very large. The codebase is Python 3.6 running on Docker/Linux.
Thanks to both @Aran-Fey and @zwer, who led me to another StackOverflow question that answered it. Here's how:
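import codecs
import csv
import tarfile

with tarfile.open(filename) as t:
    for fileinfo in t:
        with t.extractfile(fileinfo) as f:
            ft = codecs.getreader("utf-8")(f)  # StreamReader decodes bytes to str as it reads
            reader = csv.DictReader(ft)
            reader.fieldnames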
This seems to work so far.
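For anyone who prefers the io module, io.TextIOWrapper should be usable the same way, since it also decodes lazily and so keeps the streaming behaviour. The following is a minimal sketch rather than a tested drop-in: iter_csv_rows, the demo.csv member, and the in-memory archive are all made up for illustration, and UTF-8 is assumed as the encoding. It also skips non-file members, because extractfile() returns None for entries such as directories.

```python
import csv
import io
import tarfile


def iter_csv_rows(tar, encoding="utf-8"):
    """Yield (member name, row) for each regular file in an open TarFile."""
    for member in tar:
        if not member.isfile():
            continue  # extractfile() returns None for directories and similar
        raw = tar.extractfile(member)
        # newline="" lets the csv module handle line endings itself.
        with io.TextIOWrapper(raw, encoding=encoding, newline="") as text:
            for row in csv.DictReader(text):
                yield member.name, row


# Demo: build a one-file archive in memory so the sketch is self-contained.
payload = "id,name\n1,alice\n2,bob\n".encode("utf-8")
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as out:
    info = tarfile.TarInfo(name="demo.csv")
    info.size = len(payload)
    out.addfile(info, io.BytesIO(payload))
buf.seek(0)

with tarfile.open(fileobj=buf) as t:
    rows = [(name, dict(row)) for name, row in iter_csv_rows(t)]

print(rows)
```

Because the function is a generator, rows are decoded and parsed one at a time as the caller consumes them, so a large member is never loaded into memory as a whole.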