Scikit-learn ComplementNB is outputting NaN for scores
I have an unbalanced binary dataset with 23 features: 92,000 rows are labeled 0 and 207,000 rows are labeled 1.
I trained several scikit-learn models on this dataset, such as GaussianNB, DecisionTreeClassifier, and a few other classifiers, and they all work fine.
I want to run ComplementNB on this dataset, but when I do, all the scores come out as NaN.
Below is my code:
import time

import pandas as pd
from sklearn.model_selection import cross_validate, train_test_split
from sklearn.naive_bayes import ComplementNB

features = [
    # Chest accelerometer sensor
    'chest_accel_x', 'chest_accel_y', 'chest_accel_z',
    # ECG (2 leads)
    'ecg_1', 'ecg_2',
    # Left ankle sensors
    'left_accel_x', 'left_accel_y', 'left_accel_z',
    'left_gyro_x', 'left_gyro_y', 'left_gyro_z',
    'left_mag_x', 'left_mag_y', 'left_mag_z',
    # Right lower arm sensors
    'right_accel_x', 'right_accel_y', 'right_accel_z',
    'right_gyro_x', 'right_gyro_y', 'right_gyro_z',
    'right_mag_x', 'right_mag_y', 'right_mag_z',
]

df = pd.read_csv('mhealth_s_m.csv')
X = df[features]
y = df['label']
train_X, test_X, train_y, test_y = train_test_split(X, y, test_size=0.2, random_state=69)

def K_fold_unbalanced(train_X, train_y):
    # 5-fold cross-validation on the unbalanced training split
    scoring = ['accuracy', 'f1', 'precision', 'recall', 'roc_auc']
    print('Unbalanced Data')
    model = ComplementNB()
    start_time = time.time()
    scores = cross_validate(model, train_X, train_y, scoring=scoring, cv=5, return_train_score=True)
    print(scores)
    print('Took', time.time() - start_time, 'to run')
    print('=======================================')

K_fold_unbalanced(train_X, train_y)
Output is:
train accuracy nan
train f1 nan
train precision nan
train recall nan
train roc auc nan
test accuracy nan
test f1 nan
test precision nan
test recall nan
test roc auc nan
Took 0.12271976470947266 to run
Any ideas why all the values are NaN? My data can be found here.
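This fixed it. ComplementNB does not accept negative feature values, so I scaled the features into the [0, 1] range before fitting: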
from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler()
# Fit the scaler on the training split only, then reuse it on the test split
train_X = scaler.fit_transform(train_X)
test_X = scaler.transform(test_X)
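For anyone who wants to see the underlying error instead of silent NaNs: by default cross_validate records a failed fit as NaN (error_score=np.nan) and only emits a warning. Here is a minimal sketch, assuming the sensor columns contain negative readings as signed accelerometer and gyroscope values usually do. The data below is synthetic and only stands in for the real file, and putting the scaler in a pipeline is my suggestion rather than part of the original fix. It does mean the scaler is re-fit on each training fold, which avoids leaking validation-fold ranges into the scaling.

```python
import numpy as np
from sklearn.model_selection import cross_validate
from sklearn.naive_bayes import ComplementNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the sensor data (hypothetical, not the mhealth file):
# zero-centred readings, so roughly half of the values are negative.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 23))
y = (X[:, 0] + rng.normal(scale=0.5, size=1000) > 0).astype(int)

# error_score='raise' makes cross_validate re-raise the fit error
# instead of recording NaN for the failed fold.
try:
    cross_validate(ComplementNB(), X, y, cv=5, error_score='raise')
    fit_error = None
except ValueError as exc:
    fit_error = str(exc)
print('Fit error:', fit_error)

# Scale inside a pipeline so the scaler is fit on each training fold only.
# clip=True (scikit-learn >= 0.24) keeps held-out values inside [0, 1] even
# when they fall outside the range seen while fitting.
model = make_pipeline(MinMaxScaler(clip=True), ComplementNB())
scores = cross_validate(model, X, y, scoring=['accuracy', 'f1'], cv=5)
print(scores['test_accuracy'])
```

If the first call raises, the message should name the negative input as the cause. The pipeline version can be passed to cross_validate in place of the bare ComplementNB() in the question's K_fold_unbalanced function.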