
Which SKLearn interface defines .fit, .predict etc

Examining sklearn.base, more specifically BaseEstimator and the different mixins, it is obvious that some of the mixins depend on the ability to call .fit or .predict.

For example, if we look at RegressorMixin, we see that it relies on the .predict method.
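To make that dependency concrete, here is a minimal pure-Python sketch. SketchRegressorMixin, MeanRegressor and NoPredict are hypothetical names, and this is not scikit-learn's source code; it only mimics the pattern in which a mixin's score method calls self.predict while nothing forces the host class to define it.

```python
class SketchRegressorMixin:
    """Hypothetical stand-in for a mixin such as RegressorMixin."""

    def score(self, X, y):
        # The mixin simply assumes the host class provides predict().
        y_pred = self.predict(X)
        mean_y = sum(y) / len(y)
        ss_res = sum((t - p) ** 2 for t, p in zip(y, y_pred))
        ss_tot = sum((t - mean_y) ** 2 for t in y)
        return 1.0 - ss_res / ss_tot  # coefficient of determination


class MeanRegressor(SketchRegressorMixin):
    """Toy estimator that always predicts the training mean."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


class NoPredict(SketchRegressorMixin):
    """Inherits the mixin but never defines predict()."""


X = [[0], [1], [2]]
y = [1, 2, 3]

r2 = MeanRegressor().fit(X, y).score(X, y)

# Nothing complains when NoPredict is defined or instantiated;
# the problem only shows up once score() is actually called.
try:
    NoPredict().score(X, y)
    failure = None
except AttributeError as exc:
    failure = type(exc).__name__

print(r2, failure)
```

In this sketch the missing method is only detected at call time, which is the behaviour the question is asking about.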

My question is: why is there no interface or abstract class that enforces the implementation of these methods?

I'd expect something like a BaseRegressor that has .predict() as an abstract method, and a BaseClassifier that has .predict_proba() and .predict(), or something similar.
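The kind of enforcement described here could look roughly like the following sketch, built on the standard library's abc module. BaseRegressor is the hypothetical class from the question, not an existing scikit-learn API, and IncompleteRegressor and ConstantRegressor are made-up names for illustration.

```python
from abc import ABC, abstractmethod


class BaseRegressor(ABC):
    """Hypothetical abstract base class; assumed for illustration only."""

    @abstractmethod
    def fit(self, X, y):
        ...

    @abstractmethod
    def predict(self, X):
        ...


class IncompleteRegressor(BaseRegressor):
    """Implements fit() but forgets predict()."""

    def fit(self, X, y):
        return self


class ConstantRegressor(BaseRegressor):
    """Implements both abstract methods, so it can be instantiated."""

    def fit(self, X, y):
        self.value_ = y[0]
        return self

    def predict(self, X):
        return [self.value_ for _ in X]


# With an ABC, the missing method is reported at instantiation time
# rather than when predict() is first called.
try:
    IncompleteRegressor()
    instantiation_failed = False
except TypeError:
    instantiation_failed = True

print(instantiation_failed)
```

The trade-off is that every estimator would then have to inherit from BaseRegressor, and therefore import the package that defines it.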

A few things together probably make it clearer why things are done the way they are in a package like scikit-learn:

  • duck typing vs inheritance: you can find very long arguments about which one is the better approach, and while both have their advantages and disadvantages, at the end of the day it comes down to what people in a community are used to. As somebody who writes a lot of Python these days, I love duck typing and I'm very comfortable with it. At the same time, 15 years ago I loved abstract classes, OOP and whatnot, and I wouldn't have understood why you would do things any other way. What I'm trying to say is that Python people like duck typing, and that's partly why you see the pattern so often in some of its core packages.

  • duck typing, contrib packages and extensions: when checking an input, we can either check its type or duck type it for a certain functionality. If we check the type, any input to that method has to actually inherit from those classes, whereas if we duck type it, the input only has to implement those methods and it's fine. This is important because a developer writing an estimator outside scikit-learn, which they want to be compatible with certain parts of scikit-learn, doesn't have to take scikit-learn as a dependency (which they would need in order to inherit a class from the package); they can simply implement those methods. If developers are constrained to keep their package and its dependencies lightweight, this becomes relevant (and we have seen these exact issues in scikit-learn).

  • Mixin classes: the idea behind the Mixin classes is not really that child classes should inherit from them and implement their methods; it's more about adding functionality to existing classes through them, without having to copy/paste or reimplement any method. For instance, TransformerMixin adds the fit_transform method to an object, assuming it already has fit and transform, without caring about whether the object is an estimator or a transformer. Again, you could argue that a certain OOP design pattern may be better here, but that's a never-ending argument; this approach works, and the developers are comfortable with it.
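The last two points can be illustrated with a small pure-Python sketch. All names here (SketchTransformerMixin, Doubler, ThirdPartyScaler, looks_like_transformer) are hypothetical, and the mixin is a simplified stand-in rather than scikit-learn's actual TransformerMixin.

```python
class SketchTransformerMixin:
    """Simplified stand-in for a mixin like TransformerMixin."""

    def fit_transform(self, X, y=None):
        # Adds behaviour on top of fit() and transform(), which the
        # host class is assumed to provide already.
        return self.fit(X, y).transform(X)


class Doubler(SketchTransformerMixin):
    """Gets fit_transform() for free by mixing in the class above."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return [2 * x for x in X]


class ThirdPartyScaler:
    """Written without importing or inheriting from any framework."""

    def fit(self, X, y=None):
        self.max_ = max(X)
        return self

    def transform(self, X):
        return [x / self.max_ for x in X]


def looks_like_transformer(obj):
    """Duck-type check: only asks whether the needed methods exist."""
    return all(
        callable(getattr(obj, name, None)) for name in ("fit", "transform")
    )


doubled = Doubler().fit_transform([1, 2, 3])
scaler = ThirdPartyScaler()

# A type check rejects the third-party object, a duck-type check accepts it.
passes_type_check = isinstance(scaler, SketchTransformerMixin)
passes_duck_check = looks_like_transformer(scaler)

print(doubled, passes_type_check, passes_duck_check)
```

Under these assumptions, ThirdPartyScaler stays usable by any code that duck types its input, even though it shares no base class with the other transformers.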

The common idiom in Python is 'duck typing': if it behaves like a duck, it's a duck; if it implements fit or any other relevant method, it's a model as far as sklearn is concerned.

There's also the concept of abstract base classes, but its usage is less common.

See more here: https://en.wikipedia.org/wiki/Duck_typing


 