How to read a CSV file in reverse order in Python?

I know how to do it for a TXT file, but now I am having some trouble doing it for a CSV file.

How can I read a CSV file from the bottom in Python?

Pretty much the same way as for a text file: read the whole thing into a list and then go backwards:

import csv
with open('test.csv', 'r', newline='') as textfile:
    # read every row into a list, then walk the list backwards
    for row in reversed(list(csv.reader(textfile))):
        print(', '.join(row))

If you want to get fancy, you could write a lot of code that reads blocks starting at the end of the file and works backwards, emitting one line at a time, and then feed that to csv.reader. That will only work with a file that can be seeked, i.e. disk files but not standard input.


Some of us have files that do not fit into memory. Could anyone come up with a solution that does not require storing the entire file in memory?

That's a bit trickier. Luckily, all csv.reader expects is an iterator-like object that returns a string (line) per call to next(). So we grab the technique Darius Bacon presented in "Most efficient way to search the last x lines of a file in python" to read the lines of a file backwards, without having to pull in the whole file:

import os

def reversed_lines(file):
    "Generate the lines of file in reverse order."
    part = ''
    for block in reversed_blocks(file):
        for c in reversed(block):
            if c == '\n' and part:
                yield part[::-1]
                part = ''
            part += c
    if part: yield part[::-1]

def reversed_blocks(file, blocksize=4096):
    "Generate blocks of file's contents in reverse order."
    file.seek(0, os.SEEK_END)
    here = file.tell()
    while 0 < here:
        delta = min(blocksize, here)
        here -= delta
        file.seek(here, os.SEEK_SET)
        yield file.read(delta)

and feed reversed_lines into the code to reverse the lines before they get to csv.reader, removing the need for reversed and list:

import csv
with open('test.csv', 'r', newline='') as textfile:
    # reversed_lines yields the last line first, so no list is built
    for row in csv.reader(reversed_lines(textfile)):
        print(', '.join(row))
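
The reason this works is that csv.reader does not need a real file: any iterable that yields one line of text at a time will do. A minimal sketch with a hand-written generator (the function name and the sample data are made up for illustration):

```python
import csv


def fake_lines():
    """Hypothetical stand-in for reversed_lines(): yields lines last-first."""
    yield "3,three\n"
    yield "2,two\n"
    yield "1,one\n"


# csv.reader only iterates over its argument, so a generator is enough.
rows = list(csv.reader(fake_lines()))
print(rows)
```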

A more Pythonic solution is possible, one that doesn't require a character-by-character reversal of the block in memory (hint: get a list of the indices where line ends occur in the block, reverse it, and use it to slice the block) and that uses chain from itertools to glue the line clusters from successive blocks together, but that's left as an exercise for the reader.
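
One way to approach that exercise in Python 3 is sketched below. It is not the answer's own code: it reads the file in binary mode, cuts each block at line ends with bytes.split instead of building an index list and chaining with itertools, and decodes only complete lines, so a multi-byte character split across two blocks is not a problem. The encoding parameter, the tiny blocksize in the demo, and the sample data are assumptions made for illustration. Like the first reversed_lines, it does not handle newlines inside quoted fields.

```python
import csv
import io
import os


def reversed_blocks(f, blocksize=4096):
    """Yield blocks of a binary file's contents, last block first."""
    f.seek(0, os.SEEK_END)
    here = f.tell()
    while here > 0:
        delta = min(blocksize, here)
        here -= delta
        f.seek(here, os.SEEK_SET)
        yield f.read(delta)


def reversed_lines(f, blocksize=4096, encoding="utf-8"):
    """Yield the decoded lines of a binary file, last line first."""
    tail = b""     # trailing part of a line whose start lies earlier in the file
    at_end = True  # True until the block at the end of the file has been seen
    for block in reversed_blocks(f, blocksize):
        if at_end:
            at_end = False
            # Drop one final newline so it does not produce an empty line.
            if block.endswith(b"\n"):
                block = block[:-1]
        pieces = (block + tail).split(b"\n")
        tail = pieces[0]  # may be incomplete; wait for the next block
        for piece in reversed(pieces[1:]):
            yield piece.decode(encoding) + "\n"
    if not at_end:
        # Whatever is left is the first line of the file.
        yield tail.decode(encoding) + "\n"


# Demo: io.BytesIO stands in for open('test.csv', 'rb'); the small
# blocksize forces lines (and one two-byte character) to span blocks.
sample = io.BytesIO("a,b\nc,d\n\u00e9,f\n".encode("utf-8"))
rows = list(csv.reader(reversed_lines(sample, blocksize=3)))
print(rows)
```

Concatenating each block with the pending tail keeps the code short; for files with extremely long lines, collecting the pieces in a list and joining once would avoid repeated copying.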


It's worth noting that the reversed_lines() idiom above only works if the columns in the CSV file don't contain newlines.

Aargh! There's always something. Luckily, it's not too bad to fix this:

def reversed_lines(file):
    "Generate the lines of file in reverse order."
    part = ''
    quoting = False
    for block in reversed_blocks(file):
        for c in reversed(block):
            if c == '"':
                quoting = not quoting
            elif c == '\n' and part and not quoting:
                yield part[::-1]
                part = ''
            part += c
    if part: yield part[::-1]

Of course, you'll need to change the quote character if your CSV dialect doesn't use ".
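
For example, the quote character could be made a parameter. The sketch below adds a quotechar argument (an addition for illustration, not part of the answer) and repeats reversed_blocks so that it runs on its own; io.StringIO stands in for an open file, and the sample data is made up.

```python
import csv
import io
import os


def reversed_blocks(file, blocksize=4096):
    "Generate blocks of file's contents in reverse order."
    file.seek(0, os.SEEK_END)
    here = file.tell()
    while 0 < here:
        delta = min(blocksize, here)
        here -= delta
        file.seek(here, os.SEEK_SET)
        yield file.read(delta)


def reversed_lines(file, quotechar='"'):
    "Generate the lines of file in reverse order, keeping quoted newlines."
    part = ''
    quoting = False
    for block in reversed_blocks(file):
        for c in reversed(block):
            if c == quotechar:
                # entering or leaving a quoted field (seen from the end)
                quoting = not quoting
            elif c == '\n' and part and not quoting:
                yield part[::-1]
                part = ''
            part += c
    if part:
        yield part[::-1]


# Demo with a single-quote dialect; the second record has a newline in a field.
sample = io.StringIO("1,'one'\n2,'two\nlines'\n3,'three'\n")
rows = list(csv.reader(reversed_lines(sample, quotechar="'"), quotechar="'"))
print(rows)
```

The same quotechar has to be passed to csv.reader, otherwise the reader would not treat the quoted newline as part of the field.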

Building on @mike-desimone's answer, here's a solution that provides the same structure as a Python file object but is read in reverse, line by line:

import os

class ReversedFile(object):
    def __init__(self, f, mode='r'):
        """
        Wrap a file object so that it is read in reverse, line by line.

        If ``f`` is a filename, a new file object is opened.

        """
        if mode != 'r':
            raise ValueError("ReversedFile only supports read mode (mode='r')")

        if isinstance(f, str):
            # a filename rather than an open file object
            f = open(f)

        self.file = f
        self.lines = self._reversed_lines()

    def _reversed_lines(self):
        "Generate the lines of file in reverse order."
        part = ''
        for block in self._reversed_blocks():
            for c in reversed(block):
                if c == '\n' and part:
                    yield part[::-1]
                    part = ''
                part += c
        if part: yield part[::-1]

    def _reversed_blocks(self, blocksize=4096):
        "Generate blocks of file's contents in reverse order."
        file = self.file

        file.seek(0, os.SEEK_END)
        here = file.tell()
        while 0 < here:
            delta = min(blocksize, here)
            here -= delta
            file.seek(here, os.SEEK_SET)
            yield file.read(delta)


    def __getattribute__(self, name):
        """ 
        Allows for the underlying file attributes to come through

        """ 
        try:
            # ReversedFile attribute
            return super(ReversedFile, self).__getattribute__(name)
        except AttributeError:
            # self.file attribute
            return getattr(self.file, name)

    def __iter__(self):
        """ 
        Creates iterator

        """ 
        return self

    def seek(self):
        raise NotImplementedError('ReversedFile does not support seek')

    def __next__(self):
        """
        Next item in the sequence

        """
        return next(self.lines)

    next = __next__  # keep the Python 2 method name; readline() calls it

    def read(self):
        """
        Returns the entire contents of the file reversed line by line

        """
        contents = ''

        for line in self:
            contents += line

        return contents

    def readline(self):
        """
        Returns the next line from the bottom

        """
        return self.next()

    def readlines(self):
        """
        Returns all remaining lines from the bottom of the file in reverse

        """
        return [x for x in self]

Go for it. This is a simple program to reverse the rows from a CSV file.

import csv
with open('Master.csv', 'r', newline='') as BC_file:
    BC_reader = csv.reader(BC_file)
    next(BC_reader)  # skip the header row
    for row in reversed(list(BC_reader)):
        print(row[0])  # first column of each row, last row first
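
The snippet above discards the header row with next(). If the header should stay on top while only the data rows are reversed, one possible variation is sketched below; io.StringIO stands in for the opened file, and the sample data is made up for illustration.

```python
import csv
import io

# Hypothetical stand-in for open('Master.csv', newline='')
source = io.StringIO("id,name\n1,one\n2,two\n3,three\n")

reader = csv.reader(source)
header = next(reader)      # keep the header instead of throwing it away
data_rows = list(reader)   # still loads every remaining row into memory
rows = [header] + data_rows[::-1]

for row in rows:
    print(', '.join(row))
```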
