How can I install the XGBoost package in Python on Windows?

I tried to install the XGBoost package in Python. I am using Windows, 64-bit. I have gone through the following.

The package directory states that xgboost is unstable on Windows and is disabled: "pip installation on windows is currently disabled for further investigation, please install from github." https://pypi.python.org/pypi/xgboost/

I am not well versed in Visual Studio and am having problems building XGBoost. I am missing opportunities to use the xgboost package in data science.

Please guide me so that I can import the XGBoost package in Python.

Thanks

If you are using anaconda (or miniconda), you can use the following:

  • conda install -c anaconda py-xgboost (updated 2019-09-20)
  • Docs

Check the install by:

  • Activating the environment (see below)
  • Running conda list
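As a complement to conda list, the check can also be done from inside Python. This is a minimal sketch, not part of the original answer: the helper name package_version is made up for illustration, and it assumes the package exposes a __version__ attribute (xgboost does).

```python
import importlib
import importlib.util


def package_version(name):
    """Return the package's __version__ string, or None if it is not importable.

    package_version is a hypothetical helper name used only for this example.
    """
    if importlib.util.find_spec(name) is None:
        return None
    module = importlib.import_module(name)
    # Fall back to "unknown" for packages that do not define __version__.
    return getattr(module, "__version__", "unknown")


if __name__ == "__main__":
    version = package_version("xgboost")
    if version is None:
        print("xgboost is not importable from this environment")
    else:
        print("xgboost version:", version)
```

Run it with the Python interpreter of the activated environment; a different interpreter may give a different answer.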

To activate an environment:

On Windows, in your Anaconda Prompt, run (assumes your environment is named myenv):

  • activate myenv

On macOS and Linux, in your Terminal window, run (assumes your environment is named myenv):

  • source activate myenv

Conda prepends the path name myenv onto your system command.

Install it from a prebuilt wheel file:

  • download the xgboost whl file (make sure it matches your Python version and system architecture, e.g. "xgboost-0.6-cp35-cp35m-win_amd64.whl" for Python 3.5 on a 64-bit machine)
  • open a command prompt
  • cd to your Downloads folder (or wherever you saved the whl file)
  • pip install xgboost-0.6-cp35-cp35m-win_amd64.whl (or whatever your whl file is named)
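Choosing the right wheel comes down to two facts about your interpreter: its version tag (such as cp35) and whether it is 32-bit or 64-bit. The sketch below is an illustration only. The helper names interpreter_tags and wheel_matches are made up, it assumes CPython, and it only knows the two Windows platform tags win32 and win_amd64.

```python
import struct
import sys


def interpreter_tags():
    """Return (python_tag, bits) for the running interpreter, e.g. ('cp35', 64).

    Assumes CPython, hence the 'cp' prefix.
    """
    python_tag = "cp{}{}".format(sys.version_info.major, sys.version_info.minor)
    bits = struct.calcsize("P") * 8  # pointer size in bits: 32 or 64
    return python_tag, bits


def wheel_matches(filename, python_tag, bits):
    """Roughly check a Windows wheel filename against a Python tag and bitness."""
    # Wheel names end in ...-<python tag>-<abi tag>-<platform tag>.whl
    parts = filename[:-len(".whl")].split("-")
    wheel_python, wheel_platform = parts[-3], parts[-1]
    expected_platform = "win_amd64" if bits == 64 else "win32"
    return wheel_python == python_tag and wheel_platform == expected_platform


if __name__ == "__main__":
    tag, bits = interpreter_tags()
    print("This interpreter needs a", tag, "wheel for", bits, "bit")
    print(wheel_matches("xgboost-0.6-cp35-cp35m-win_amd64.whl", tag, bits))
```

Note that the bitness is that of the Python interpreter, not of Windows itself: a 32-bit Python on 64-bit Windows still needs a win32 wheel.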

You first need to build the library with "make"; then you can install it from the Anaconda Prompt (if you want it in Anaconda) or Git Bash (if you use plain Python only).

First follow the official guide with the following procedure (in Git Bash on Windows):

git clone --recursive https://github.com/dmlc/xgboost
cd xgboost
git submodule init
git submodule update

Then install TDM-GCC and do the following in Git Bash:

alias make='mingw32-make'
cp make/mingw64.mk config.mk; make -j4

Last, do the following from the xgboost directory, using the Anaconda Prompt or Git Bash:

cd python-package
python setup.py install
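Before starting a build like this, it can save time to confirm that the tools it relies on are actually on your PATH. The following is a small sketch under stated assumptions: missing_tools is a hypothetical helper, and the tool names are the ones used in the steps above (git, mingw32-make from TDM-GCC, python).

```python
import shutil


def missing_tools(tools):
    """Return the subset of tool names that cannot be found on PATH.

    missing_tools is a hypothetical helper name used only for this example.
    """
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    # Tool names taken from the build steps; adjust to your own setup.
    missing = missing_tools(["git", "mingw32-make", "python"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All build tools found")
```

Run it from the same shell you intend to build in, since Git Bash and the Anaconda Prompt can have different PATH settings.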

Also refer to these great resources:

Official Guide

Installing Xgboost on Windows

Installing XGBoost For Anaconda on Windows

You can pip install catboost. It is a recently open-sourced gradient boosting library, which is in most cases more accurate and faster than XGBoost, and it has categorical features support. Here is the site of the library: https://catboost.ai

The following command should work:

conda install -c conda-forge xgboost

If you have a problem with this command, first activate your environment. Assuming your environment is named MY_ENV, simply write in the conda terminal:

activate <MY_ENV>

and then

 pip install xgboost
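A common reason for pip install appearing to succeed while the import still fails is that pip ran against a different interpreter than the activated environment. Conda sets the CONDA_PREFIX environment variable when an environment is activated, so a rough check is to compare it with sys.prefix. This is a sketch only; matches_active_conda_env is a made-up helper name.

```python
import os
import sys


def matches_active_conda_env(prefix, environ):
    """Return True if prefix is the conda environment named by CONDA_PREFIX.

    matches_active_conda_env is a hypothetical helper name; prefix and environ
    are passed in explicitly so the check is easy to try with other values.
    """
    conda_prefix = environ.get("CONDA_PREFIX")
    if not conda_prefix:
        return False  # no conda environment is active in this shell

    def normalize(path):
        return os.path.normcase(os.path.normpath(path))

    return normalize(prefix) == normalize(conda_prefix)


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    print("Matches active conda env:", matches_active_conda_env(sys.prefix, os.environ))
```

If this prints False after activating the environment, running python -m pip install xgboost instead of plain pip ties the install to the interpreter you are actually using.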

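On macOS, the following command works: conda install -c conda-forge xgboost. However, I had read some other articles beforehand, so I had already installed gcc with brew.

pip install xgboost also works for Python 3.8, whereas the other options mentioned above did not work for me.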

I have installed xgboost on Windows following the above resources; it is not yet available through pip. However, I tried the following code to tune the CV parameters:

#Import libraries:
import pandas as pd
import numpy as np
import xgboost as xgb
from xgboost.sklearn import XGBClassifier
from sklearn import cross_validation, metrics   #Additional sklearn functions
from sklearn.grid_search import GridSearchCV   #Performing grid search

import matplotlib.pylab as plt
%matplotlib inline
from matplotlib.pylab import rcParams
rcParams['figure.figsize'] = 12, 4

train = pd.read_csv('train_data.csv')
target = 'target_value'
IDcol = 'ID'

A function is created to get the optimum parameters and display the output in visual form.

def modelfit(alg, dtrain, predictors, useTrainCV=True, cv_folds=5, early_stopping_rounds=50):

    if useTrainCV:
        #Use xgb.cv with early stopping to choose the number of boosting rounds
        xgb_param = alg.get_xgb_params()
        xgtrain = xgb.DMatrix(dtrain[predictors].values, label=dtrain[target].values)
        cvresult = xgb.cv(xgb_param, xgtrain, num_boost_round=alg.get_params()['n_estimators'], nfold=cv_folds,
            metrics='auc', early_stopping_rounds=early_stopping_rounds, show_progress=False)
        alg.set_params(n_estimators=cvresult.shape[0])

    #Fit the algorithm on the data
    alg.fit(dtrain[predictors], dtrain[target], eval_metric='auc')

    #Predict training set:
    dtrain_predictions = alg.predict(dtrain[predictors])
    dtrain_predprob = alg.predict_proba(dtrain[predictors])[:,1]

    #Print model report:
    print("\nModel Report")
    print("Accuracy : %.4g" % metrics.accuracy_score(dtrain[target].values, dtrain_predictions))
    print("AUC Score (Train): %f" % metrics.roc_auc_score(dtrain[target], dtrain_predprob))

    #Plot feature importances
    feat_imp = pd.Series(alg.booster().get_fscore()).sort_values(ascending=False)
    feat_imp.plot(kind='bar', title='Feature Importances')
    plt.ylabel('Feature Importance Score')

Now, when the function is called to get the optimum parameters:

#Choose all predictors except target & IDcols
predictors = [x for x in train.columns if x not in [target, IDcol]]
#Named xgb1 so that it does not shadow the xgb module used inside modelfit
xgb1 = XGBClassifier(
    learning_rate=0.1,
    n_estimators=1000,
    max_depth=5,
    min_child_weight=1,
    gamma=0,
    subsample=0.7,
    colsample_bytree=0.7,
    objective='binary:logistic',
    nthread=4,
    scale_pos_weight=1,
    seed=198)
modelfit(xgb1, train, predictors)

Although the feature importance chart is displayed, the parameter info in the red box at the top of the chart is missing. I consulted people who use Linux/macOS and have xgboost installed; they are getting the above info. I was wondering whether it is due to the specific implementation that I built and installed on Windows, and how I can get the parameter info displayed above the chart. As of now, I am getting the chart but not the red box and the info within it. Thanks.

Besides what's already on the developers' GitHub, which is building from source (creating a C++ environment, etc.), I have found an easier way to do it, which I have explained in detail. Basically, you have to go to a website by UC Irvine and download a .whl file, then cd to the folder and install xgboost with pip.

XGBoost is used in applied machine learning and is known for its gradient boosting algorithm. It is available as a library in Python but has to be compiled using cmake.

Alternatively, you can download the C pre-compiled library from this link and install it using the pip install <FILE-NAME.whl> command. Ensure you have downloaded the library that is compatible with your Python version.

I experienced this problem while using it in Anaconda (Spyder). Just restart the kernel and your error will go away.
