CatBoost shows very poor results on a toy dataset

Question

Today I tried out the amazing CatBoost library recently published by Yandex, but it shows very poor results even on a toy dataset. I tried to find the root of the problem, but with so little documentation and discussion about the library I can't figure out what's going on. Please help me =) I'm using Anaconda 3 x64 with Python 3.6.

from sklearn.datasets import make_classification
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score, roc_curve, f1_score, make_scorer
from catboost import CatBoostClassifier

X, y = make_classification(n_classes=2,
                           n_clusters_per_class=2,
                           n_features=10,
                           n_informative=4,
                           n_repeated=2,
                           shuffle=True,
                           random_state=564,
                           n_samples=10000)

X_train,X_test,y_train,y_test = train_test_split(X,y,train_size = 0.8)

cb = CatBoostClassifier(depth=3,
                        custom_loss=['Accuracy', 'AUC'],
                        logging_level='Silent',
                        iterations=500,
                        od_type='Iter',
                        od_wait=20)
cb.fit(X_train,y_train,eval_set=(X_test,y_test),plot=True,use_best_model=True)
pred = cb.predict_proba(X_test)[:,1]
# roc_curve returns (fpr, tpr, thresholds), in that order
fpr, tpr, _ = roc_curve(y_true=y_test, y_score=pred)

# scikit-learn's gradient boosting, just to show the difference
from sklearn.ensemble import GradientBoostingClassifier
import matplotlib.pyplot as plt
gbc = GradientBoostingClassifier().fit(X_train, y_train)
pred_gbc = gbc.predict_proba(X_test)[:, 1]
fpr_gbc, tpr_gbc, _ = roc_curve(y_true=y_test, y_score=pred_gbc)

plt.plot(fpr, tpr, color='orange')        # CatBoost
plt.plot(fpr_gbc, tpr_gbc, color='red')   # GradientBoostingClassifier
plt.show()

Answer 1

It was a bug, and it was fixed in version 0.6.1. Be careful and make sure you are using the latest version.
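
To check whether an installation already includes the fix, something along these lines could work. This is only a minimal sketch: the helper names `version_tuple` and `has_fix` are made up for illustration and are not part of CatBoost, and the comparison assumes plain dotted version numbers such as 0.6.1. It reads the installed version through the standard library's `importlib.metadata`, which needs Python 3.8 or newer.

```python
import re
from importlib.metadata import PackageNotFoundError, version

FIXED_IN = "0.6.1"  # the version the answer above says contains the fix


def version_tuple(text):
    """Turn '0.6.1' into (0, 6, 1); stops at the first non-numeric component."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def has_fix(installed, fixed_in=FIXED_IN):
    """Hypothetical helper: True if `installed` is at least `fixed_in`."""
    return version_tuple(installed) >= version_tuple(fixed_in)


try:
    installed = version("catboost")
except PackageNotFoundError:
    print("catboost is not installed in this environment")
else:
    if has_fix(installed):
        print(f"catboost {installed} is at or above {FIXED_IN}")
    else:
        print(f"catboost {installed} is older than {FIXED_IN}; "
              "try: pip install --upgrade catboost")
```

If the reported version is older than 0.6.1, upgrading and rerunning the script from the question should show whether the gap to GradientBoostingClassifier goes away.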

Answered 2018-02-05 16:03:52
