
Why are many Python built-in/standard library functions actually classes?

Many Python builtin "functions" are actually classes, although they could just as easily be implemented as plain functions. This is true even of very simple ones, such as itertools.repeat. What is the motivation for this? It seems like over-engineering to me.

Edit: I am not asking about the purpose of itertools.repeat or any other particular function. It was just an example of a very simple function with a very simple possible implementation:

def repeat(x):
    while True: yield x

But itertools.repeat is not actually a function; it's implemented as a class. My question is: why? It seems like unnecessary overhead.

I also understand that classes are callable and that you can emulate function-like behavior using a class. But I don't understand why this is so widely used throughout the standard library.

Implementing the itertools tools as classes has some advantages that generator functions don't have. For example:

  1. CPython implements these built-ins at the C layer, and at the C layer, a generator "function" is best implemented as a class implementing __next__ that preserves state as instance attributes; yield based generators are a Python layer nicety, and really, they're just instances of the generator class (so they're actually still class instances, like everything else in Python).
  2. Generators aren't pickleable or copyable, and don't have a "story" for making them support either behavior (the internal state is too complex and opaque to generalize it); a class can define __reduce__ / __copy__ / __deepcopy__ (and if it's a Python level class, it probably doesn't even need to do that; it will work automatically) and make the instances pickleable/copyable (so if you have already generated 5 elements from a range iterator, you can copy or pickle/unpickle it, and get an iterator the same distance along in iteration).
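As a minimal sketch of the second point, assuming a hypothetical Python-level iterator class Countdown and an equivalent generator function countdown_gen (neither name comes from the standard library): the class instance can be copied mid-iteration with no extra code, while the generator cannot.

```python
import copy


class Countdown:
    """Hypothetical class-based iterator; its state is a plain attribute."""

    def __init__(self, start):
        self.remaining = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.remaining <= 0:
            raise StopIteration
        value = self.remaining
        self.remaining -= 1
        return value


def countdown_gen(start):
    """Hypothetical generator equivalent of Countdown."""
    while start > 0:
        yield start
        start -= 1


it = Countdown(5)
consumed = [next(it), next(it)]   # advance two steps

clone = copy.copy(it)             # works automatically for a Python-level class
rest_of_clone = list(clone)       # continues from the same position
rest_of_original = list(it)       # the original is unaffected by the clone

gen = countdown_gen(5)
try:
    copy.copy(gen)
    generator_is_copyable = True
except TypeError:
    # Generators expose no copy/pickle support.
    generator_is_copyable = False
```

The copy picks up exactly where the original left off, which is the "same distance along in iteration" behavior described above.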

For non-generator tools, the reasons are usually similar. Classes can be given state and customized behaviors that a function can't. They can be inherited from (if that's desired; C layer classes can prohibit subclassing if they're "logically" functions).

It's also useful for dynamic instance creation; if you have an instance of an unknown class but a known prototype (say, the sequence constructors that take an iterable, or chain or whatever), and you want to convert some other type to that class, you can do type(unknown)(constructorarg); if it's a generator, type(unknown) is useless: you can't use it to make more of itself, because you can't introspect to figure out where it came from (not in reasonable ways).
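A small sketch of the type(unknown)(constructorarg) idiom, assuming a hypothetical helper named doubled that only requires its argument's type to accept an iterable:

```python
def doubled(values):
    """Return a container of the same type as `values`, with each item doubled."""
    return type(values)(item * 2 for item in values)


doubled_list = doubled([1, 2, 3])    # stays a list
doubled_tuple = doubled((1, 2, 3))   # stays a tuple


def numbers():
    """Hypothetical generator function, used only for the contrast below."""
    yield 1
    yield 2


gen = numbers()
try:
    type(gen)(item * 2 for item in gen)
    rebuilt_generator = True
except TypeError:
    # The generic generator type cannot be instantiated directly.
    rebuilt_generator = False
```

The helper never needs to know which concrete class it was given, whereas the generator's type offers no way back to numbers.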

And beyond that, even if you never use the features for programming logic, what would you rather see in the interactive interpreter, or when print-debugging type(myiter): <class 'generator'>, which gives no hint as to origin, or <class 'itertools.repeat'>, which tells you exactly what you have and where it came from?
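To make the comparison concrete, here is a short sketch; repeat_gen is a hypothetical generator version of repeat, like the one in the question:

```python
import itertools


def repeat_gen(x):
    """Hypothetical generator-based stand-in for itertools.repeat."""
    while True:
        yield x


class_based = repr(type(itertools.repeat("x")))
generator_based = repr(type(repeat_gen("x")))

print(class_based)       # <class 'itertools.repeat'>
print(generator_based)   # <class 'generator'>
```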

Both functions and classes are callables, so they can be used interchangeably in higher-order functions, for example.

$ python2
... 
>>> map(dict, [["ab"], ["cd"], ["ef"]])
[{'a': 'b'}, {'c': 'd'}, {'e': 'f'}]
>>> map(lambda x: dict(x), [["ab"], ["cd"], ["ef"]])
[{'a': 'b'}, {'c': 'd'}, {'e': 'f'}]

That said, classes can also define methods that you can later call on the returned objects. For instance, the dict class defines the .get() method for dictionaries.
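For readers on Python 3, a rough equivalent of the session above is sketched here; the only assumption is that map now returns a lazy iterator, so it is wrapped in list():

```python
pairs = [["ab"], ["cd"], ["ef"]]

# A class and a function are passed to map() in exactly the same way.
from_class = list(map(dict, pairs))
from_lambda = list(map(lambda x: dict(x), pairs))

# The objects returned by the class carry the methods it defines.
found = from_class[0].get("a")
fallback = from_class[0].get("z", "missing")
```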

In the case of itertools.repeat (and most iterators), using a proper class implementing the iterator protocol has a few advantages from the implementation / maintenance POV - you can have better control of the iteration, you can specialize the class, etc. I also suspect that there are some optimisations that can be done at C-level for proper iterators that don't apply to generators.
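One concrete illustration of that extra control, offered as an example rather than as the reason the class was written this way: itertools.repeat provides a descriptive repr and a length hint, neither of which a generator offers. repeat_gen below is a hypothetical generator equivalent.

```python
import itertools
import operator


def repeat_gen(x, times):
    """Hypothetical generator equivalent of itertools.repeat(x, times)."""
    for _ in range(times):
        yield x


builtin = itertools.repeat("x", 3)
gen = repeat_gen("x", 3)

builtin_repr = repr(builtin)                  # shows the value and the count
builtin_hint = operator.length_hint(builtin)  # the class reports what is left
gen_hint = operator.length_hint(gen)          # falls back to the default of 0
```

Consumers such as list() can use a length hint to size their storage up front, which is the kind of C-level optimisation a generic generator cannot take part in.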

Also remember that classes and functions are objects too - the def statement is mostly syntactic sugar for creating a function instance and populating it with compiled code, local namespace, cells, closures and whatnot (a somewhat involved task FWIW; I did it once just out of curiosity and it was a major PITA), and the class statement is also syntactic sugar for creating a new type instance (doing it manually happens to be really trivial, actually). From this POV, yield is a similar syntactic sugar that turns your function into a factory returning instances of the generic generator builtin type - IOW it makes your function act like a class, without the hassle of writing a full-blown class but also without the fine control and possible optimisations you can get by writing a full-blown class.
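A rough sketch of what that sugar stands for, reusing an already compiled code object instead of building one from scratch; all names here (add_one, Point, count_up) are hypothetical:

```python
import types


def add_one(x):
    return x + 1


# Build a second function object by hand from the same compiled code.
add_one_copy = types.FunctionType(add_one.__code__, globals(), "add_one_copy")


def _init(self, x, y):
    self.x = x
    self.y = y


def _describe(self):
    return f"Point({self.x}, {self.y})"


# Equivalent of a `class Point:` statement: call type(name, bases, namespace).
Point = type("Point", (object,), {"__init__": _init, "describe": _describe})


def count_up(limit):
    n = 0
    while n < limit:
        yield n
        n += 1


# Calling a function that contains yield returns a generator instance.
gen = count_up(3)
is_generator_instance = isinstance(gen, types.GeneratorType)
```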

On a more general level, sometimes writing your "function" as a custom callable type instead offers similar gains - fine control, possible optimisations, and sometimes just better readability (think of two-step decorators, custom descriptors, etc.).
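As an illustration of the two-step decorator case, here is a minimal sketch assuming a hypothetical decorator class named tagged: the first step (__init__) takes the parameters, the second (__call__) takes the function.

```python
import functools


class tagged:
    """Hypothetical parameterised decorator written as a callable class."""

    def __init__(self, tag):
        # Step one: capture the decorator's own argument.
        self.tag = tag

    def __call__(self, func):
        # Step two: receive the function and return the wrapper.
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            return f"[{self.tag}] {func(*args, **kwargs)}"

        return wrapper


@tagged("info")
def greet(name):
    return f"hello {name}"


greeting = greet("world")
```

The same behavior can be written with two nested closures; the class version keeps each step in its own named method.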

Finally, wrt/ builtin types (int, str, etc.), IIRC (someone please correct me if I'm wrong) they were originally factory functions (before the new-style classes revolution, when builtin types and user-defined types were different kinds of objects). It of course makes sense to have them as plain classes now, but they had to keep the all_lower naming scheme for compatibility.
