
Creating multiple files from a data source in Python

I have a data source I'm working with in Python. I'd like to save that data to files such that once a size threshold is hit (e.g. 1K, 1M) the current file is closed and a new file is automatically opened to continue saving the data.

For example:

<file handler with buffer 200>
file.write('a'*1000)

The line above would generate 5 files based on the data. Is there a pre-built Python library that will handle this, or do I need to write one myself?

If a logger framework is too much, you can do it yourself; it shouldn't need more than a dozen lines of code or so. The easiest way to get the size of your file is to call the tell() method of your open file object. It returns the current position in the file, which is the size so far as long as you only write sequentially.
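As a rough illustration of the tell() approach, here is a minimal sketch. The class name TellRotatingWriter, the numbered file naming, and the choice of binary mode are my own assumptions, not part of any library; binary mode is used because tell() on a text-mode file returns an opaque value rather than a byte count.

```python
import os
import tempfile


class TellRotatingWriter:
    """Hypothetical helper: start a new numbered file once tell() reaches `limit`."""

    def __init__(self, basename, limit):
        if limit <= 0:
            raise ValueError("limit must be positive")
        self.basename = basename
        self.limit = limit
        self.counter = 0
        self.file = self._open_next()

    def _open_next(self):
        # Files are named basename0, basename1, ... (an assumed convention).
        name = f"{self.basename}{self.counter}"
        self.counter += 1
        return open(name, "wb")

    def write(self, data):
        while data:
            # tell() is the current position, i.e. the bytes written so far
            # to a file that is only ever written sequentially.
            room = self.limit - self.file.tell()
            if room <= 0:
                self.file.close()
                self.file = self._open_next()
                room = self.limit
            self.file.write(data[:room])
            data = data[room:]

    def close(self):
        self.file.close()


# Usage: 1000 bytes with a 200-byte limit, written into a temporary directory.
demo_dir = tempfile.mkdtemp()
writer = TellRotatingWriter(os.path.join(demo_dir, "part"), 200)
writer.write(b"a" * 1000)
writer.close()
print(sorted(os.listdir(demo_dir)))
```

A new file is only opened when there is more data to write, so an exact multiple of the limit does not leave an empty trailing file.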

You could also keep track of the bytes being output yourself, but this requires additional logic if your program sometimes appends to a pre-existing file.
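The extra logic for the pre-existing-file case might look something like the sketch below: seed the counter with the file's current size before appending. CountingAppender and its remaining() method are hypothetical names used only for illustration.

```python
import os
import tempfile


class CountingAppender:
    """Hypothetical helper: count bytes written, starting from the existing file size."""

    def __init__(self, path):
        # When appending to a pre-existing file, the count must start at
        # its current size rather than at zero.
        self.written = os.path.getsize(path) if os.path.exists(path) else 0
        self.file = open(path, "ab")

    def write(self, data):
        # write() on a binary file returns the number of bytes written.
        self.written += self.file.write(data)

    def remaining(self, limit):
        # Bytes that can still be written before `limit` is reached.
        return max(limit - self.written, 0)

    def close(self):
        self.file.close()


# Usage: append to a file that already holds 300 bytes.
path = os.path.join(tempfile.mkdtemp(), "data.bin")
with open(path, "wb") as f:
    f.write(b"x" * 300)

appender = CountingAppender(path)
appender.write(b"y" * 50)
print(appender.written, appender.remaining(1000))
appender.close()
```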

A quick search on PyPI brings up this, which might do what you want. Otherwise I'd suggest writing it yourself; it would be a fairly simple tool to write.

I haven't tested it, but here's a very simple implementation that should do it (Python 3).

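class RotatingFile:
    """Buffer written data and split it into numbered files of at most `size` units."""

    def __init__(self, basename, size, binary=False):
        self.basename = basename
        self.size = size
        self.counter = 0
        # The open mode and the buffer type must match: bytes or str.
        self.mode = 'wb' if binary else 'w'
        self.buffer = b'' if binary else ''

    def _write_chunk(self, data):
        # Each chunk goes to its own file: basename0, basename1, ...
        name = self.basename + str(self.counter)
        with open(name, self.mode) as f:
            f.write(data)
        self.counter += 1

    def write(self, data):
        self.buffer += data
        # Loop rather than test once, so a single large write fills several files.
        while len(self.buffer) >= self.size:
            self._write_chunk(self.buffer[:self.size])
            self.buffer = self.buffer[self.size:]

    def flush(self):
        # Write whatever is left over into a final, shorter file.
        if self.buffer:
            self._write_chunk(self.buffer)
            self.buffer = self.buffer[:0]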

So this should write to 6 files:

>>> f = RotatingFile('myfile', 1000)
>>> f.write('a' * 5500)
>>> f.flush()
