
How do I determine an open file's size in Python?

There's a file that I would like to make sure does not grow larger than 2 GB (as it must live on a system that uses ext2). What's a good way to check the file's size, bearing in mind that I will be writing to this file in between checks? In particular, do I need to worry about buffered, unflushed changes that haven't been written to disk yet?

Perhaps not what you want, but I'll suggest it anyway.

import os
a = os.path.getsize("C:/TestFolder/Input/1.avi")

Alternatively, for a file that is already open, you can use the os.fstat function. It takes an integer file descriptor, not a file object, so you have to call the fileno() method on the file object:

a = open("C:/TestFolder/Input/1.avi", "rb")
b = os.fstat(a.fileno()).st_size  # size in bytes, as reported by the OS

os.fstat(file_obj.fileno()).st_size should do the trick. It returns the size in bytes as the operating system sees it, so data still sitting in Python's write buffer is not counted. If you are concerned about buffering, you can always call file_obj.flush() beforehand.
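To illustrate the buffering point, here is a minimal Python 3 sketch. The helper name size_on_disk and the temporary file demo.bin are hypothetical, chosen only for this example; the small write is assumed to fit inside Python's default write buffer.

```python
import os
import tempfile


def size_on_disk(f):
    """Flush Python's buffer, then return the size the OS reports for the open file."""
    f.flush()
    return os.fstat(f.fileno()).st_size


# Hypothetical scratch file, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "demo.bin")
with open(path, "wb") as f:
    f.write(b"x" * 100)                      # small write, likely still buffered
    before = os.fstat(f.fileno()).st_size    # may not include the buffered bytes
    after = size_on_disk(f)                  # includes them, because of flush()

print(before, after)
```

On a typical setup the first number is smaller than the second, because the 100 bytes have not reached the operating system until flush() is called.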

I'm not familiar with Python, but doesn't the stream object (or whatever you get when opening a file) have a property that contains the current position of the stream?

Similar to what you get with the ftell() C function, or Stream.Position in .NET.

Obviously, this only works if you are positioned at the end of the stream, which you are if you are currently writing to it.

The benefit of this approach is that you don't have to close the file or worry about unflushed data.
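In Python the stream position is available through tell(). The following Python 3 sketch assumes the file is opened in binary mode and that the position is at the end of the file. The names LIMIT and can_write are hypothetical, and the value 2**31 - 1 is only an assumed stand-in for the "2 GB" limit mentioned in the question.

```python
import os
import tempfile

LIMIT = 2**31 - 1  # assumed byte limit for illustration; adjust to the real constraint


def can_write(f, data, limit=LIMIT):
    """Return True if appending `data` at the current position stays within `limit`."""
    return f.tell() + len(data) <= limit


# Hypothetical scratch file, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "demo.bin")
with open(path, "wb") as f:
    f.write(b"x" * 100)
    position = f.tell()                        # counts bytes that are still buffered
    fits = can_write(f, b"y" * 10)             # 110 bytes is far below LIMIT
    too_big = can_write(f, b"y", limit=100)    # 101 bytes would exceed a limit of 100

print(position, fits, too_big)
```

Because tell() on a binary file object accounts for data that is still buffered, no flush is needed before the check.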

You could start with something like this:

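# Python 2 only: the built-in file type was removed in Python 3.
class TrackedFile(file):
    def __init__(self, filename, mode):
        self.size = 0  # bytes written through this object so far
        super(TrackedFile, self).__init__(filename, mode)

    def write(self, s):
        self.size += len(s)
        super(TrackedFile, self).write(s)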

Then you could use it like this:

>>> f = TrackedFile('palindrome.txt', 'w')
>>> f.size
0
>>> f.write('A man a plan a canal ')
>>> f.size
21
>>> f.write('Panama')
>>> f.size
27

Obviously, this implementation doesn't work if you aren't writing the file from scratch, but you could adapt your __init__ method to handle initial data. You might also need to override some other methods: writelines, for instance.

This works regardless of encoding, because the strings passed to write() are already encoded byte strings, so len() counts bytes rather than characters.

>>> f2 = TrackedFile('palindrome-latin1.txt', 'w')
>>> f2.write(u'A man a plan a canál '.encode('latin1'))
>>> f3 = TrackedFile('palindrome-utf8.txt', 'w')
>>> f3.write(u'A man a plan a canál '.encode('utf-8'))
>>> f2.size
21
>>> f3.size
22

Though this is an old question, I think that Isak has the simplest solution. Here's how to do it in Python:

# Assuming f is an open file
>>> pos = f.tell()  # Save the current position
>>> f.seek(0, 2)  # Seek to the end of the file
>>> length = f.tell()  # The current position is the length
>>> f.seek(pos)  # Return to the saved position
>>> print length
1024
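The same seek-and-tell idea can be wrapped in a small helper. This is a Python 3 sketch that assumes a seekable binary stream; the function name stream_size is hypothetical, and io.BytesIO stands in for a real file so the example is self-contained.

```python
import io
import os


def stream_size(f):
    """Return the total length of a seekable binary stream, restoring its position."""
    pos = f.tell()                        # remember where we are
    try:
        return f.seek(0, os.SEEK_END)     # in Python 3, seek() returns the new position
    finally:
        f.seek(pos)                       # go back, even if something went wrong


buf = io.BytesIO(b"hello world")
buf.seek(3)
total = stream_size(buf)
print(total, buf.tell())
```

Using os.SEEK_END instead of the literal 2 makes the intent of the seek easier to read.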

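The most reliable approach is to create a wrapper class that checks the file's size when it is opened, tracks write and seek operations, computes the current size from those operations, and prevents the size limit from being exceeded.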
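A minimal Python 3 sketch of such a wrapper might look like the following. The class name SizeLimitedFile, its attributes, and the default limit of 2**31 - 1 bytes are all assumptions made for illustration. It only supports binary, non-append modes, and it does not cover truncate() or writelines().

```python
import os
import tempfile


class SizeLimitedFile:
    """Hypothetical wrapper: tracks size from writes and seeks, refuses to pass `limit`."""

    def __init__(self, path, mode="wb", limit=2**31 - 1):
        if "b" not in mode or "a" in mode:
            # Text mode would make len() count characters; append mode ignores seeks.
            raise ValueError("use a binary, non-append mode such as 'wb' or 'r+b'")
        self._f = open(path, mode)
        self.limit = limit
        self.size = os.fstat(self._f.fileno()).st_size  # data already in the file
        self._pos = self._f.tell()

    def write(self, data):
        end = self._pos + len(data)
        if end > self.limit:
            raise IOError("write would grow the file past %d bytes" % self.limit)
        written = self._f.write(data)
        self._pos += written
        self.size = max(self.size, self._pos)  # overwrites do not grow the file
        return written

    def seek(self, offset, whence=os.SEEK_SET):
        self._pos = self._f.seek(offset, whence)
        return self._pos

    def close(self):
        self._f.close()

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        self.close()


# Hypothetical scratch file and a tiny limit, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "limited.bin")
with SizeLimitedFile(path, "wb", limit=10) as f:
    f.write(b"12345")
    f.seek(0)
    f.write(b"ab")              # overwrites the first two bytes; size stays 5
    f.seek(0, os.SEEK_END)
    try:
        f.write(b"678901")      # 5 + 6 = 11 bytes, which is over the limit of 10
        refused = False
    except IOError:
        refused = True
    final_size = f.size

print(refused, final_size)
```

The check happens before the underlying write, so a refused write leaves the file unchanged.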

Or, if the file is already open:

>>> data = open('/etc/hosts', 'rb').read()  # reads the whole file into memory
>>> len(data)
444

That's how many bytes the file is. Note that this reads the entire file into memory, so it is impractical for files approaching 2 GB.
