
Explanation of a function used in a class for feature selection

I came across the following function:

import numpy as np

def indices_of_top_k(arr, k):
    return np.sort(np.argpartition(np.array(arr), -k)[-k:])

I can't work out what it does or how each of its components works. Could someone please explain it?

For context, it is used in the class below for feature selection:

from sklearn.base import BaseEstimator, TransformerMixin

class TopFeatureSelector(BaseEstimator, TransformerMixin):
    def __init__(self, feature_importances, k):
        self.feature_importances = feature_importances
        self.k = k
    def fit(self, X, y=None):
        self.feature_indices_ = indices_of_top_k(self.feature_importances, self.k)
        return self
    def transform(self, X):
        return X[:, self.feature_indices_]

Thanks,

partition can be harder to understand than sort. Think of it as an incomplete sort.

In [152]: x=np.random.randint(0,50,12)
In [153]: x
Out[153]: array([16, 16,  4, 33, 39, 43, 28, 47,  2, 23, 25, 11])

To get the largest 5 elements, we can sort and slice:

In [154]: np.sort(x)[-5:]
Out[154]: array([28, 33, 39, 43, 47])

partition returns the same values, but their order within the slice is not guaranteed:

In [155]: np.partition(x,-5)[-5:]
Out[155]: array([28, 33, 39, 47, 43])
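
To make "incomplete sort" concrete, here is a minimal sketch of what np.partition guarantees, using the same array as in the session above. The variable names (p, pivot, top_k_values) are only for illustration. The element at position -k lands where a full sort would put it, everything before it is less than or equal to it, and everything from it onward is greater than or equal to it. The order inside each side is left unspecified.

```python
import numpy as np

# Same array as in the session above.
x = np.array([16, 16, 4, 33, 39, 43, 28, 47, 2, 23, 25, 11])
k = 5

p = np.partition(x, -k)

# The element at position -k is exactly where a full sort would place it.
pivot = p[-k]
assert pivot == np.sort(x)[-k]

# Left side: nothing larger than the pivot. Right side: nothing smaller.
assert (p[:-k] <= pivot).all()
assert (p[-k:] >= pivot).all()

# Sorting only the last k entries recovers the sorted top-k values.
top_k_values = np.sort(p[-k:])
print(top_k_values)  # [28 33 39 43 47]
```

Because only the k-element tail is sorted afterwards, the final np.sort works on a small array regardless of how large x is.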

The corresponding indices:

In [156]: np.argpartition(x,-5)[-5:]
Out[156]: array([6, 3, 4, 7, 5])

Sorting those indices:

In [157]: np.sort(np.argpartition(x,-5)[-5:])
Out[157]: array([3, 4, 5, 6, 7])

Using argsort instead gives the same result here, but argpartition is generally faster on large arrays because it performs a selection rather than a full sort:

In [158]: np.sort(np.argsort(x)[-5:])
Out[158]: array([3, 4, 5, 6, 7])
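
As a quick check of that equivalence, the sketch below compares the two approaches on a larger array. It assumes the values are distinct (a random permutation of 0..999, chosen here only for illustration). With ties at the cut-off, the two calls may legitimately pick different indices of equal values.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.permutation(1000)  # distinct values 0..999 in random order
k = 5

# Selection-based: only guarantees which k indices end up in the tail.
via_partition = np.sort(np.argpartition(x, -k)[-k:])

# Full sort of all 1000 indices, then keep the last k.
via_argsort = np.sort(np.argsort(x)[-k:])

# With distinct values the top-k set is unique, so both must agree.
assert np.array_equal(via_partition, via_argsort)

print(np.sort(x[via_partition]))  # [995 996 997 998 999]
```

To compare speed on your own data, timeit (or %timeit in IPython) on the two expressions is the simplest measurement. The gap tends to grow with array size.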

From this we can get the 5 largest values in their original order, as opposed to the sorted order in Out[154]:

In [159]: x[_]
Out[159]: array([33, 39, 43, 28, 47])
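
Putting the pieces together for the feature-selection case: the sketch below applies the function from the question to a made-up list of importances and then slices the columns of a made-up feature matrix, which is the same indexing that transform performs in the class. The importances and the matrix X are hypothetical values chosen for illustration, not data from the question.

```python
import numpy as np

def indices_of_top_k(arr, k):
    # Function from the question: indices of the k largest entries,
    # returned in ascending index order.
    return np.sort(np.argpartition(np.array(arr), -k)[-k:])

# Hypothetical importances for 5 features.
feature_importances = [0.10, 0.40, 0.05, 0.30, 0.15]
top = indices_of_top_k(feature_importances, 2)
print(top)  # [1 3]

# Hypothetical data: 3 samples x 5 features.
X = np.arange(15).reshape(3, 5)

# Same indexing as TopFeatureSelector.transform: keep only the top-k columns.
X_selected = X[:, top]
print(X_selected)
# [[ 1  3]
#  [ 6  8]
#  [11 13]]
```

Sorting the indices keeps the selected columns in their original left-to-right order, so the reduced matrix stays aligned with the original feature order.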

