
Code enters infinite loop when trying to select features

I am trying to use scikit-learn's recursive feature elimination with cross-validation (RFECV) on a dataset of shape (5000, 37) for a binary classification problem, and whenever I fit the model the algorithm appears to enter an infinite loop. I am following this example on how to use the algorithm: https://scikit-learn.org/stable/auto_examples/feature_selection/plot_rfe_with_cross_validation.html

My data is:

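    import numpy as np
    from sklearn.svm import SVC
    from sklearn.model_selection import StratifiedKFold
    from sklearn.feature_selection import RFECV

    # 5000 samples, 37 features with very large values
    X = np.random.randint(0, 363175645.191632, size=(5000, 37))
    # One binary label per sample (the label array needs 5000 entries, not 37)
    Y = np.random.choice([0, 1], size=(5000,))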

I tried to select the features with:

    svc = SVC(kernel="linear")
    rfecv = RFECV(estimator=svc, step=1, cv=StratifiedKFold(2),
                  scoring='accuracy')
    
    rfecv.fit(X, Y)

The code hangs and seems to enter an infinite loop. However, when I use another estimator such as ExtraTreesClassifier, it works just fine. What is going on?

An SVM is distance based, so it makes sense to scale your feature variables before fitting, especially when their values are as large as yours. With unscaled features of this magnitude the solver can take a very long time to converge, which can look like an infinite loop. You can also check out this intro to SVM. Using an example dataset:

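    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.datasets import make_blobs
    from sklearn.preprocessing import StandardScaler

    # Two informative columns drawn from three blobs
    X, y = make_blobs(n_samples=5000, centers=3, shuffle=False, random_state=42)
    # Append 35 noise columns with very large values, as in the question
    X = np.concatenate((X, np.random.randint(0, 363175645.191632, size=(5000, 35))), axis=1)
    # Binary target: blob 1 versus the rest
    y = (y == 1).astype('int')

    # Standardize every column to zero mean and unit variance
    scaler = StandardScaler()
    X_scaled = scaler.fit_transform(X)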
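To get a feel for what the scaling step changes, here is a minimal sketch that compares column magnitudes before and after `StandardScaler`. It rebuilds a dataset of the same shape; the seeded generator `rng` and the names `X_info`, `X_noise`, `raw_max` and `scaled_max` are assumptions made for this illustration and are not part of the original answer.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only

# Two informative columns plus 35 large-valued noise columns.
X_info, _ = make_blobs(n_samples=5000, centers=3, shuffle=False, random_state=42)
X_noise = rng.integers(0, 363175645, size=(5000, 35))
X = np.concatenate((X_info, X_noise), axis=1)

X_scaled = StandardScaler().fit_transform(X)

# Largest absolute value in each column, before and after scaling.
raw_max = np.abs(X).max(axis=0)
scaled_max = np.abs(X_scaled).max(axis=0)
print("raw column maxima range:   ", raw_max.min(), "to", raw_max.max())
print("scaled column maxima range:", scaled_max.min(), "to", scaled_max.max())
```

Before scaling, the noise columns are many orders of magnitude larger than the two informative ones; afterwards every column has mean 0 and standard deviation 1, so no single column dominates the distances the SVM works with.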

This dataset has only 2 useful variables, in the first two columns, as you can see from the plot:

    plt.scatter(x=X_scaled[:, 0], y=X_scaled[:, 1], c=['k' if i else 'b' for i in y])


Now we run RFECV on the scaled data, and we can see that it ranks the first two columns as the top variables:

    from sklearn.svm import SVC
    from sklearn.model_selection import StratifiedKFold
    from sklearn.feature_selection import RFECV

    svc = SVC(kernel="linear")
    rfecv = RFECV(estimator=svc, step=1, cv=StratifiedKFold(2), scoring='accuracy')
    rfecv.fit(X_scaled, y)

    rfecv.ranking_

    array([ 1,  2, 17, 28, 33, 22, 23, 26,  6, 19, 20,  4, 10, 25,  3, 27, 11,
            8, 18,  5, 29, 14,  7, 21,  9, 13, 24, 30, 35, 31, 32, 34, 16, 36,
           37, 12, 15])
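A possible variation, not part of the original answer, is to put the scaler and the SVM into a `Pipeline`, so that scaling is re-fit on each training split during cross-validation. The sketch below assumes a scikit-learn version whose `RFECV` accepts `importance_getter` (needed so it can find `coef_` inside the pipeline). The step names `"scale"` and `"svc"` and the reduced dataset size (600 rows, 8 noise columns) are assumptions chosen to keep the run short.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.feature_selection import RFECV
from sklearn.model_selection import StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only

# Smaller stand-in for the answer's data (sizes are assumptions chosen for speed):
# 2 informative columns followed by 8 large-valued noise columns.
X_info, y = make_blobs(n_samples=600, centers=3, shuffle=False, random_state=42)
X = np.concatenate((X_info, rng.integers(0, 363175645, size=(600, 8))), axis=1)
y = (y == 1).astype(int)

# Scaling lives inside the pipeline, so it is re-fit on every training split.
pipe = Pipeline([("scale", StandardScaler()), ("svc", SVC(kernel="linear"))])

rfecv = RFECV(
    estimator=pipe,
    step=1,
    cv=StratifiedKFold(2),
    scoring="accuracy",
    # Tell RFECV where the linear SVM's coefficients are inside the pipeline.
    importance_getter="named_steps.svc.coef_",
)
rfecv.fit(X, y)

print("ranking:", rfecv.ranking_)  # rank 1 marks a selected feature
print("selected columns:", np.flatnonzero(rfecv.support_))
print("number selected:", rfecv.n_features_)
```

In `ranking_`, a value of 1 means the feature was kept, and larger values mean it was eliminated earlier. `support_` is the corresponding boolean mask, and `n_features_` is the number of features that gave the best cross-validated score.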
