Iterating through a csv file

Hi, I am trying to iterate through a csv file, but I cannot get it to work. I followed the Python docs but I am still not able to iterate through it. I am working with a gzipped csv file in this format:

2015-01-10 00:00:05;32

As you can see, it is delimited with a ';'.

Here is my code to run through it (simplified):

 gzip_fd = gzip.decompress(gzip_file).decode(encoding='utf8')
 csv_data = csv.reader(gzip_fd, delimiter=';', lineterminator='\n')
 for data in csv_data:
     print(data)

But when I want to work with data, it only contains a single character (like: 2) and not the first field of the csv row that I need. Has anyone here had the same issue? I also tried csv.DictReader, but with no success.

Even if your snippet were fixed to work, it would buffer all the data in memory, which might not scale well for very large files.
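
For context, the single characters appear because csv.reader iterates over whatever object it is given, and iterating over a str yields one character at a time. Here is a minimal sketch of that behaviour; the `text` value is made-up sample data standing in for the decoded file, not the asker's real input:

```python
import csv

# Hypothetical sample standing in for the decoded file contents.
text = "2015-01-10 00:00:05;32\n2015-01-10 00:00:10;33\n"

# csv.reader iterates over its argument; a str yields single characters,
# so every character is parsed as if it were a whole line.
rows_from_str = list(csv.reader(text, delimiter=';'))

# A list of lines is iterated line by line, which is what the reader expects.
rows_from_lines = list(csv.reader(text.splitlines(), delimiter=';'))

print(rows_from_str[:3])   # one-character rows
print(rows_from_lines)     # one row per line, split on ';'
```

Passing any iterable of lines (a list, an open text file, an io.StringIO) instead of a bare string avoids the problem.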

Gzipped data can also be iterated on the fly. The following works for me on CPython 3.8:

import csv
import gzip


with gzip.open('test.csv.gz', 'r') as gzipped:
    reader = csv.reader(gzipped, delimiter=';', lineterminator='\n')

    for line in reader:
        print(line)

Output:

['2015-01-10 00:00:05', '32']

<...>

Update: As per the comments below, my snippet does not work on older Python versions (reproduced on CPython 3.5).

You can use io.TextIOWrapper to achieve the same effect:

import csv
import io
import gzip


with gzip.open('test.csv.gz', 'rb') as gzipped:
    reader = csv.reader(io.TextIOWrapper(gzipped), delimiter=';',
                        lineterminator='\n')

    for line in reader:
        print(line)
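
A possible alternative, not part of the snippets above, is to let gzip.open do the decoding itself by opening the file in text mode ('rt'), so that csv.reader receives str lines rather than bytes. The sketch below first writes a small sample file so it can run on its own; the temporary path and the sample rows are assumptions made for illustration:

```python
import csv
import gzip
import os
import tempfile

# Create a small gzipped sample so the sketch is self-contained
# (the file location and contents are made up for illustration).
path = os.path.join(tempfile.mkdtemp(), 'test.csv.gz')
with gzip.open(path, 'wt', encoding='utf-8', newline='') as f:
    f.write('2015-01-10 00:00:05;32\n2015-01-10 00:00:10;33\n')

# 'rt' makes gzip.open return decoded str lines instead of bytes;
# newline='' leaves line-ending handling to the csv module.
with gzip.open(path, 'rt', encoding='utf-8', newline='') as gzipped:
    rows = [row for row in csv.reader(gzipped, delimiter=';')]

print(rows)
```

The file is still decompressed incrementally as the reader pulls lines; only the `rows` list in this sketch holds everything in memory, and a plain for loop over the reader would avoid that.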

So I fixed my issue. The problem was that I didn't split the string that I get (I can't use gzip.open because my input isn't a file but rather a bytes string of the gzipped file).

Here is the fix to my problem:

gzip_fd = gzip.decompress(compressed_data).decode(encoding='utf-8').split('\n')
self.data = csv.reader(gzip_fd, delimiter=';', lineterminator='\n')
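
As a side note, splitting on '\n' leaves a trailing empty string when the data ends with a newline, which csv.reader reports as an empty row. One way to avoid both that and the full decode step might be to wrap the bytes in io.BytesIO, since gzip.open also accepts an existing file object. This is a sketch under that assumption; `compressed_data` here is built from made-up sample rows rather than real input:

```python
import csv
import gzip
import io

# Hypothetical stand-in for the gzipped bytes received from elsewhere.
compressed_data = gzip.compress(
    b'2015-01-10 00:00:05;32\n2015-01-10 00:00:10;33\n'
)

# BytesIO exposes the in-memory bytes as a file object, which gzip.open
# accepts in place of a filename; 'rt' decodes to str for csv.reader.
with gzip.open(io.BytesIO(compressed_data), 'rt',
               encoding='utf-8', newline='') as gzipped:
    rows = list(csv.reader(gzipped, delimiter=';'))

print(rows)
```

Because the reader consumes the stream line by line, the decompressed text never has to exist as one large string.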
