
scikit-learn ComplementNB outputs NaN for all scores

I have an unbalanced binary dataset with 23 features: 92,000 rows are labeled 0 and 207,000 rows are labeled 1.

I trained several scikit-learn models on this dataset, including GaussianNB and DecisionTreeClassifier, and they all work fine.

I want to run ComplementNB on this dataset, but when I do, all the scores come out as NaN.

Below is my code:

import time

import pandas as pd
from sklearn.model_selection import cross_validate, train_test_split
from sklearn.naive_bayes import ComplementNB

features = [
    # Chest accelerometer sensor
    'chest_accel_x', 'chest_accel_y', 'chest_accel_z',

    # ECG (2 leads)
    'ecg_1', 'ecg_2',

    # Left ankle sensors
    'left_accel_x', 'left_accel_y', 'left_accel_z',
    'left_gyro_x', 'left_gyro_y', 'left_gyro_z',
    'left_mag_x', 'left_mag_y', 'left_mag_z',

    # Right lower arm sensors
    'right_accel_x', 'right_accel_y', 'right_accel_z',
    'right_gyro_x', 'right_gyro_y', 'right_gyro_z',
    'right_mag_x', 'right_mag_y', 'right_mag_z',
]

df = pd.read_csv('mhealth_s_m.csv')
X = df[features]
y = df['label']
train_X, test_X, train_y, test_y = train_test_split(X, y, test_size=0.2, random_state=69)

def K_fold_unbalanced(train_X, train_y):
    # 5-fold cross-validation of ComplementNB on the raw (unscaled) features
    scoring = ['accuracy', 'f1', 'precision', 'recall', 'roc_auc']
    print('Unbalanced Data')
    model = ComplementNB()
    start_time = time.time()
    scores = cross_validate(model, train_X, train_y, scoring=scoring, cv=5, return_train_score=True)
    print(scores)
    print('Took', time.time() - start_time, 'to run')
    print('=======================================')

K_fold_unbalanced(train_X, train_y)

Output is:

train accuracy nan 
 train f1 nan 
 train precision nan 
 train recall nan 
 train roc auc nan

test accuracy nan 
 test f1 nan 
 test precision nan 
 test recall nan 
 test roc auc nan
Took 0.12271976470947266 to run

Any ideas why all the values are NaN? My data can be found here

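Scaling the features to the range [0, 1] fixed it. ComplementNB, like MultinomialNB, only accepts non-negative feature values, and raw accelerometer, gyroscope and magnetometer readings are usually signed. Most likely every fit raised a ValueError, which cross_validate (with its default error_score of np.nan) reports as NaN instead of stopping; the very short run time in the output is consistent with that.

from sklearn.preprocessing import MinMaxScaler

# train_X and test_X are the splits from the question
scaler = MinMaxScaler()
train_X = scaler.fit_transform(train_X)
test_X = scaler.transform(test_X)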
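To confirm the cause rather than guess, and to keep the scaler from seeing the validation folds, the same fix can be sketched with a pipeline. This is only an illustration: the random `X_demo` and `y_demo` arrays are made-up stand-ins for the sensor data, and the assumption is that the real features contain negative values. Passing `error_score="raise"` to `cross_validate` makes it re-raise the fit error instead of silently returning NaN.

```python
import numpy as np
from sklearn.model_selection import cross_validate
from sklearn.naive_bayes import ComplementNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

# Hypothetical stand-in data: signed features (like raw sensor readings), binary labels.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 5))
y_demo = (X_demo[:, 0] > 0).astype(int)

# 1) error_score="raise" surfaces the exception that the default (np.nan) hides.
try:
    cross_validate(ComplementNB(), X_demo, y_demo, cv=5, error_score="raise")
    failure = None
except ValueError as exc:
    failure = str(exc)
print("Unscaled fit failed with:", failure)

# 2) Inside a pipeline, MinMaxScaler is re-fitted on the training part of each fold,
#    so no information from the validation part leaks into the scaling.
#    clip=True keeps validation values inside [0, 1] if they fall outside the
#    range seen during fitting.
model = make_pipeline(MinMaxScaler(clip=True), ComplementNB())
scores = cross_validate(
    model, X_demo, y_demo,
    scoring=["accuracy", "f1"], cv=5, error_score="raise",
)
print(scores["test_accuracy"])
```

With the real data, the pipeline would be passed to `cross_validate` in place of the bare `ComplementNB()` in `K_fold_unbalanced`, using the unscaled `train_X`. Whether min-max scaled sensor readings suit a model designed for count-like features is a separate question; this only removes the NaN scores.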

