python sklearn GradientBoostingClassifier warm start error

Question

I have trained a classifier on a dataset using this model with 1000 iterations:

clf = GradientBoostingClassifier(n_estimators=1000, learning_rate=0.05, subsample=0.1, max_depth=3)
clf.fit(X, y, sample_weight=train_weight)

Now I want to increase the number of iterations to 2000, so I do this:

clf.set_params(n_estimators=2000, warm_start=True)
clf.fit(X, y, sample_weight=train_weight)

But I get the following error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-13-49cfdfd6c024> in <module>()

      1 start = time.clock()
      2 clf.set_params(n_estimators=2000, warm_start=True)
----> 3 clf.fit(X, y, sample_weight=train_weight)
      4 ...

C:\Anaconda3\lib\site-packages\sklearn\ensemble\gradient_boosting.py in fit(self, X, y, sample_weight, monitor)
   1002                                     self.estimators_.shape[0]))
   1003             begin_at_stage = self.estimators_.shape[0]
-> 1004             y_pred = self._decision_function(X)
   1005             self._resize_state()
   1006 

C:\Anaconda3\lib\site-packages\sklearn\ensemble\gradient_boosting.py in _decision_function(self, X)
   1120         # not doing input validation.
   1121         score = self._init_decision_function(X)
-> 1122         predict_stages(self.estimators_, X, self.learning_rate, score)
   1123         return score
   1124 

sklearn/ensemble/_gradient_boosting.pyx in sklearn.ensemble._gradient_boosting.predict_stages (sklearn\ensemble\_gradient_boosting.c:2564)()

ValueError: ndarray is not C-contiguous

What am I doing wrong here?

Answer 1

Your use of warm_start is correct. There is actually a bug that prevents it from working.

In the meantime, a workaround is to copy the arrays into C-contiguous arrays:

import numpy as np

# Force a C-ordered (row-major) copy of each array
X_train = np.copy(X_train, order='C')
X_test = np.copy(X_test, order='C')
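
To check whether an array actually needs this copy, its memory layout can be inspected first. The following is a minimal sketch on made-up data: `X_f` is a hypothetical Fortran-ordered array standing in for the training matrix, and `np.ascontiguousarray` is used as an alternative to `np.copy(..., order='C')` that only copies when the input is not already C-contiguous.

```python
import numpy as np

# Hypothetical stand-in for X_train: a Fortran-ordered (column-major) array.
X_f = np.asfortranarray(np.arange(12, dtype=np.float64).reshape(4, 3))
is_c_before = X_f.flags['C_CONTIGUOUS']

# Convert to C order; a copy is made only if the layout requires it.
X_c = np.ascontiguousarray(X_f)
is_c_after = X_c.flags['C_CONTIGUOUS']

print(is_c_before, is_c_after)
# The values are unchanged; only the memory layout differs.
print(np.array_equal(X_f, X_c))
```

Arrays produced by slicing columns, transposing, or some pandas conversions are common sources of non-C-contiguous input.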

Reference: the related discussion and bug report.

Answer 2

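You generally cannot modify an sklearn classifier between calls to fit and expect it to work. The number of estimators actually affects the size of the model's internal objects, so from a programming point of view it is not just a number of iterations.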
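
The dependence of the fitted state on n_estimators can be observed directly. This is a small sketch on synthetic data (the dataset and the small values 5 and 10 are assumptions chosen for speed), comparing the shapes of the `estimators_` and `train_score_` attributes of two independently fitted models.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Synthetic binary classification data, for illustration only.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

shapes = {}
for n in (5, 10):
    clf = GradientBoostingClassifier(n_estimators=n, max_depth=3, random_state=0)
    clf.fit(X, y)
    # estimators_ holds one row of fitted trees per boosting stage;
    # train_score_ holds one training loss value per stage.
    shapes[n] = (clf.estimators_.shape, clf.train_score_.shape)

print(shapes)
```

For a binary problem each stage holds a single regression tree, so the first dimension of `estimators_` tracks n_estimators.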

Answer 3

It seems to me the problem is that you did not pass warm_start=True to the constructor. If you do this:

clf = GradientBoostingClassifier(n_estimators=1000, learning_rate=0.05, subsample=0.1, max_depth=3, warm_start=True)

you will be able to fit additional estimators with the following:

clf.set_params(n_estimators=2000)
clf.fit(X, y, sample_weight=train_weight)

If that does not work, you should try updating your version of sklearn.
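
The steps above can be put together as a self-contained sketch. The synthetic data, the uniform `train_weight`, and the reduced stage counts (10 and 20 instead of 1000 and 2000) are assumptions made so the example runs quickly; it also prints the installed sklearn version, since the behaviour in the question depends on it.

```python
import numpy as np
import sklearn
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

print(sklearn.__version__)

# Synthetic data and uniform weights, for illustration only.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X = np.ascontiguousarray(X)  # make sure the input is C-contiguous
train_weight = np.ones(len(y))

# warm_start=True is passed to the constructor, as suggested above.
clf = GradientBoostingClassifier(n_estimators=10, learning_rate=0.05,
                                 subsample=0.5, max_depth=3,
                                 warm_start=True, random_state=0)
clf.fit(X, y, sample_weight=train_weight)
n_before = clf.estimators_.shape[0]

# Raise the target number of stages and fit again; with warm_start,
# only the additional stages are trained.
clf.set_params(n_estimators=20)
clf.fit(X, y, sample_weight=train_weight)
n_after = clf.estimators_.shape[0]

print(n_before, n_after)
```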


Answer 1
Score 2, 2016-10-17 05:00:53

Answer 2
Score 0, accepted, 2016-03-02 23:28:17

Answer 3
Score 0, 2016-10-18 16:07:45
