
How to get the leader model's feature importances in H2O AutoML (PySparkling Water)

I am using a Spark standalone cluster and running H2O PySparkling on it. I cannot find the function for getting the leader model's feature importances. Please help.

Code:

import pandas as pd
from pyspark.sql import SparkSession
from pysparkling import *
import h2o
from pyspark import SparkFiles
from pysparkling.ml import H2OAutoML
spark = SparkSession.builder.appName('SparkApplication').getOrCreate()

conf = H2OConf()
hc = H2OContext.getOrCreate(conf)

def xgb_automl_features_importance(data, target_metric):
    # Convert the input DataFrame to an H2OFrame, then to a Spark DataFrame
    hf = h2o.H2OFrame(data)
    sparkDF = hc.asSparkFrame(hf)
    # Identify the response column
    y = target_metric
    aml = H2OAutoML(labelCol=y)
    aml.setIncludeAlgos(["XGBoost"])
    aml.setMaxModels(1)
    aml.fit(sparkDF)
    print('-----------****************')
    # show() prints the table itself and returns None, so it is not wrapped in print()
    aml.getLeaderboard().show(truncate=False)

The fit method on H2OAutoML returns the leader model. Each model in Sparkling Water (SW) has a getFeatureImportances() method, which returns a Spark DataFrame with the feature importances.

model = aml.fit(sparkDF)
model.getFeatureImportances().show()
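If only the top few features are needed rather than the full table, the rows can be collected and ranked on the driver. The following is a minimal sketch, not part of the Sparkling Water API: the helper top_features is hypothetical, and the column names "Variable" and "Scaled Importance" are assumptions about the DataFrame that getFeatureImportances() returns, so check them against model.getFeatureImportances().printSchema() first. The sample rows are made-up placeholder values so that the sketch runs without a cluster.

```python
def top_features(rows, n=10, name_col="Variable", score_col="Scaled Importance"):
    """Return the n (feature, score) pairs with the highest score.

    `rows` is any iterable of mappings indexable by column name,
    e.g. plain dicts or pyspark Row objects.
    The default column names are assumptions; adjust them to your schema.
    """
    pairs = [(r[name_col], float(r[score_col])) for r in rows]
    return sorted(pairs, key=lambda p: p[1], reverse=True)[:n]


# With a fitted Sparkling Water model this would look like (untested assumption):
#   rows = model.getFeatureImportances().collect()
#   print(top_features(rows, n=5))

# Placeholder rows with made-up values, standing in for collected Spark rows.
sample_rows = [
    {"Variable": "age", "Scaled Importance": 0.4},
    {"Variable": "income", "Scaled Importance": 1.0},
    {"Variable": "tenure", "Scaled Importance": 0.7},
]

print(top_features(sample_rows, n=2))
```

Collecting to the driver is reasonable here because the importance table has one row per feature, which is normally small.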
