
How to get the leader feature importances in H2O AutoML with PySparkling Water

I am using a Spark standalone cluster and running H2O PySparkling on it. I cannot find the function for getting the leader model's feature importances. Please help.

Code:

import pandas as pd
from pyspark.sql import SparkSession
from pysparkling import *
import h2o
from pyspark import SparkFiles
from pysparkling.ml import H2OAutoML
spark = SparkSession.builder.appName('SparkApplication').getOrCreate()

conf = H2OConf()
hc = H2OContext.getOrCreate(conf)

def xgb_automl_features_importance(data, target_metric):
    # Converting DataFrame in H2OFrame
    hf = h2o.H2OFrame(data)
    sparkDF = hc.asSparkFrame(hf)
    # Identify predictors and response
    y = target_metric
    aml = H2OAutoML(labelCol=y)
    aml.setIncludeAlgos(["XGBoost"])
    aml.setMaxModels(1)
    aml.fit(sparkDF)
    print('-----------****************')
    # show() prints the leaderboard itself and returns None, so no print() wrapper is needed
    aml.getLeaderboard().show(truncate=False)

The fit method on H2OAutoML returns the leader model. Each model in Sparkling Water (SW) has the method getFeatureImportances(), which returns a Spark DataFrame with the feature importances.

# fit() returns the leader model
model = aml.fit(sparkDF)
# Spark DataFrame of feature importances for the leader
model.getFeatureImportances().show()
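
Applied to the function in the question, the same two calls could be wrapped so that the importances are returned to the caller instead of only printed. This is a minimal sketch, not the library's own API: the helper name fit_and_get_importances is hypothetical, and it assumes only what is stated above, namely that fit() returns the leader model and that the model exposes getFeatureImportances().

```python
def fit_and_get_importances(automl, spark_df):
    """Fit an AutoML estimator and return (leader_model, importances).

    Hypothetical helper: `automl` is expected to be a configured
    pysparkling.ml.H2OAutoML instance and `spark_df` a Spark DataFrame.
    """
    # fit() returns the leader model, as described in the answer above
    leader = automl.fit(spark_df)
    # Spark DataFrame with the leader's feature importances
    importances = leader.getFeatureImportances()
    return leader, importances
```

Inside xgb_automl_features_importance, the call aml.fit(sparkDF) would then become model, importances = fit_and_get_importances(aml, sparkDF), followed by importances.show(truncate=False) and a return of importances if the caller needs them.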

