简体   繁体   English

在 Python 中检查对象是否类似于文件

[英]Check if object is file-like in Python

File-like objects are objects in Python that behave like a real file, eg have a read() and a write method(), but have a different implementation from file . 类文件对象是 Python 中的对象,其行为类似于真实文件,例如具有 read() 和 write method(),但与file具有不同的实现。 It is realization of the Duck Typing concept.它是鸭子打字概念的实现。

It is considered a good practice to allow a file-like object everywhere where a file is expected so that eg a StringIO or a Socket object can be used instead a real file.在需要文件的任何地方都允许类似文件的对象被认为是一种很好的做法,这样例如可以使用StringIO或 Socket 对象代替真实文件。 So it is bad to perform a check like this:所以执行这样的检查是不好的:

if not isinstance(fp, file):
   raise something

What is the best way to check if an object (eg a parameter of a method) is "file-like"?检查对象(例如方法的参数)是否“类文件”的最佳方法是什么?

For 3.1+, one of the following:对于 3.1+,以下选项之一:

isinstance(something, io.TextIOBase)
isinstance(something, io.BufferedIOBase)
isinstance(something, io.RawIOBase)
isinstance(something, io.IOBase)

For 2.x, "file-like object" is too vague a thing to check for, but the documentation for whatever function(s) you're dealing with will hopefully tell you what they actually need;对于 2.x,“类文件对象”太模糊,无法检查,但是您正在处理的任何函数的文档都有望告诉您它们实际需要什么; if not, read the code.如果没有,请阅读代码。


As other answers point out, the first thing to ask is what exactly you're checking for.正如其他答案所指出的那样,首先要问的是您正在检查什么。 Usually, EAFP is sufficient, and more idiomatic.通常,EAFP 就足够了,而且更加地道。

The glossary says "file-like object" is a synonym for "file object", which ultimately means it's an instance of one of the three abstract base classes defined in the io module , which are themselves all subclasses of IOBase .术语表说“类文件对象”是“文件对象”的同义词,这最终意味着它是io模块中定义的三个抽象基类之一的实例,它们本身都是IOBase子类。 So, the way to check is exactly as shown above.所以,检查的方法完全如上图所示。

(However, checking IOBase isn't very useful. Can you imagine a case where you need to distinguish an actual file-like read(size) from some one-argument function named read that isn't file-like, without also needing to distinguish between text files and raw binary files? So, really, you almost always want to check, eg, "is a text file object", not "is a file-like object".) (但是,检查IOBase不是很有用。你能想象这样一种情况,你需要区分一个实际的类似文件的read(size)和一些名为read单参数函数,而不是像文件一样的,而不需要区分文本文件和原始二进制文件?所以,实际上,您几乎总是想检查,例如,“是一个文本文件对象”,而不是“是一个类似文件的对象”。)


For 2.x, while the io module has existed since 2.6+, built-in file objects are not instances of io classes, neither are any of the file-like objects in the stdlib, and neither are most third-party file-like objects you're likely to encounter.对于 2.x,虽然io模块自 2.6+ 以来就存在,但内置文件对象不是io类的实例,也不是 stdlib 中的任何类文件对象,也不是大多数第三方类文件对象您可能会遇到的对象。 There was no official definition of what "file-like object" means; “类文件对象”的含义没有官方定义; it's just "something like a builtin file object ", and different functions mean different things by "like".它只是“类似于内置文件对象的东西”,不同的功能通过“like”意味着不同的东西。 Such functions should document what they mean;这些功能应该记录它们的含义; if they don't, you have to look at the code.如果他们不这样做,您必须查看代码。

However, the most common meanings are "has read(size) ", "has read() ", or "is an iterable of strings", but some old libraries may expect readline instead of one of those, some libraries like to close() files you give them, some will expect that if fileno is present then other functionality is available, etc. And similarly for write(buf) (although there are a lot fewer options in that direction).然而,最常见的含义是“已read(size) ”、“已read() ”或“是一个可迭代的字符串”,但一些旧库可能期望readline而不是其中之一,有些库喜欢close()文件,有些人会期望如果fileno存在,则其他功能可用,等等。类似的write(buf) (尽管在这个方向上的选项要少得多)。

As others have said you should generally avoid such checks.正如其他人所说,您通常应该避免此类检查。 One exception is when the object might legitimately be different types and you want different behaviour depending on the type.一个例外是当对象可能合法地是不同的类型并且您希望根据类型有不同的行为时。 The EAFP method doesn't always work here as an object could look like more than one type of duck! EAFP 方法在这里并不总是有效,因为一个对象可能看起来不止一种类型的鸭子!

For example an initialiser could take a file, string or instance of its own class.例如,初始化程序可以采用文件、字符串或它自己的类的实例。 You might then have code like:然后你可能有这样的代码:

class A(object):
    def __init__(self, f):
        if isinstance(f, A):
            # Just make a copy.
        elif isinstance(f, file):
            # initialise from the file
        else:
            # treat f as a string

Using EAFP here could cause all sorts of subtle problems as each initialisation path gets partially run before throwing an exception.在这里使用 EAFP 可能会导致各种微妙的问题,因为每个初始化路径在抛出异常之前都会部分运行。 Essentially this construction mimics function overloading and so isn't very Pythonic, but it can be useful if used with care.本质上,这种构造模仿了函数重载,因此不是很 Pythonic,但如果小心使用它会很有用。

As a side note, you can't do the file check in the same way in Python 3. You'll need something like isinstance(f, io.IOBase) instead.作为旁注,您不能在 Python 3 中以相同的方式进行文件检查。您需要使用isinstance(f, io.IOBase)类的东西。

It is generally not good practice to have checks like this in your code at all unless you have special requirements.除非您有特殊要求,否则在您的代码中进行这样的检查通常不是一个好习惯。

In Python the typing is dynamic, why do you feel need to check whether the object is file like, rather than just using it as if it was a file and handling the resulting error?在 Python 中,类型是动态的,为什么您觉得需要检查对象是否与文件类似,而不是像使用文件一样使用它并处理由此产生的错误?

Any check you can do is going to happen at runtime anyway so doing something like if not hasattr(fp, 'read') and raising some exception provides little more utility than just calling fp.read() and handling the resulting attribute error if the method does not exist.无论如何,您可以做的任何检查都将在运行时发生,因此执行if not hasattr(fp, 'read')并引发一些异常提供的实用程序比仅调用fp.read()并处理产生的属性错误(如果方法不存在。

The dominant paradigm here is EAFP: easier to ask forgiveness than permission.这里的主导范式是 EAFP:请求宽恕比许可更容易。 Go ahead and use the file interface, then handle the resulting exception, or let them propagate to the caller.继续使用文件接口,然后处理产生的异常,或者让它们传播给调用者。

It's often useful to raise an error by checking a condition, when that error normally wouldn't be raised until much later on.通过检查条件来引发错误通常很有用,因为该错误通常要到很久以后才会引发。 This is especially true for the boundary between 'user-land' and 'api' code.对于“user-land”和“api”代码之间的边界尤其如此。

You wouldn't place a metal detector at a police station on the exit door, you would place it at the entrance!你不会在警察局门口放一个金属探测器,你会把它放在入口处! If not checking a condition means an error might occur that could have been caught 100 lines earlier, or in a super-class instead of being raised in the subclass then I say there is nothing wrong with checking.如果不检查条件意味着可能会发生一个错误,该错误可能会在 100 行之前被捕获,或者在超类中而不是在子类中引发,那么我说检查没有任何问题。

Checking for proper types also makes sense when you are accepting more than one type.当您接受不止一种类型时,检查正确的类型也很有意义。 It's better to raise an exception that says "I require a subclass of basestring, OR file" than just raising an exception because some variable doesn't have a 'seek' method...最好引发一个异常,说“我需要一个基字符串的子类,或文件”,而不是仅仅引发一个异常,因为某些变量没有“寻求”方法......

This doesn't mean you go crazy and do this everywhere, for the most part I agree with the concept of exceptions raising themselves, but if you can make your API drastically clear, or avoid unnecessary code execution because a simple condition has not been met do so!这并不意味着你会发疯并到处这样做,在大多数情况下,我同意异常引发的概念,但是如果你能让你的 API 非常清晰,或者避免不必要的代码执行,因为一个简单的条件没有得到满足这样做!

You can try and call the method then catch the exception:您可以尝试调用该方法然后捕获异常:

try:
    fp.read()
except AttributeError:
    raise something

If you only want a read and a write method you could do this:如果你只想要一个读和写方法,你可以这样做:

if not (hasattr(fp, 'read') and hasattr(fp, 'write')):
   raise something

If I were you I would go with the try/except method.如果我是你,我会使用 try/except 方法。

I ended up running into your question when I was writing an open -like function that could accept a file name, file descriptor or pre-opened file-like object.当我编写一个类似open的函数可以接受文件名、文件描述符或预先打开的类文件对象时,我最终遇到了你的问题。

Rather than testing for a read method, as the other answers suggest, I ended up checking if the object can be opened.正如其他答案所暗示的那样,我没有测试read方法,而是检查了对象是否可以打开。 If it can, it's a string or descriptor, and I have a valid file-like object in hand from the result.如果可以,它是一个字符串或描述符,并且我手头有一个有效的类似文件的对象。 If open raises a TypeError , then the object is already a file.如果open引发TypeError ,则该对象已经是一个文件。

Under most circumstances, the best way to handle this is not to.在大多数情况下,处理这个问题的最好方法是不要。 If a method takes a file-like object, and it turns out the object it's passed isn't, the exception that gets raised when the method tries to use the object is not any less informative than any exception you might have raised explicitly.如果一个方法接受一个类似文件的对象,而事实证明它传递的对象不是,则该方法尝试使用该对象时引发的异常并不比您可能明确引发的任何异常信息少。

There's at least one case where you might want to do this kind of check, though, and that's when the object's not being immediately used by what you've passed it to, eg if it's being set in a class's constructor.但是,至少在一种情况下,您可能想要进行这种检查,那就是当对象没有立即被您传递给它的对象使用时,例如,如果它是在类的构造函数中设置的。 In that case, I would think that the principle of EAFP is trumped by the principle of "fail fast."在那种情况下,我会认为 EAFP 的原则被“快速失败”的原则压倒了。 I'd check the object to make sure it implemented the methods that my class needs (and that they're methods), eg:我会检查对象以确保它实现了我的类需要的方法(并且它们是方法),例如:

class C():
    def __init__(self, file):
        if type(getattr(file, 'read')) != type(self.__init__):
            raise AttributeError
        self.file = file

try this:试试这个:

import os 

if os.path.isfile(path):
   'do something'

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM