How to get the leader model's feature importances in H2O AutoML (PySparkling Water)

Question

I am using a Spark standalone cluster and running H2O PySparkling on it. I cannot find the function for getting the leader model's feature importances. Please help.

Code:

import pandas as pd
from pyspark.sql import SparkSession
from pysparkling import *
import h2o
from pyspark import SparkFiles
from pysparkling.ml import H2OAutoML
spark = SparkSession.builder.appName('SparkApplication').getOrCreate()

conf = H2OConf()
hc = H2OContext.getOrCreate(conf)

def xgb_automl_features_importance(data, target_metric):
    # Convert the input DataFrame to an H2OFrame, then to a Spark DataFrame
    hf = h2o.H2OFrame(data)
    sparkDF = hc.asSparkFrame(hf)
    # Identify the response column
    y = target_metric
    # Restrict AutoML to a single XGBoost model
    aml = H2OAutoML(labelCol=y)
    aml.setIncludeAlgos(["XGBoost"])
    aml.setMaxModels(1)
    aml.fit(sparkDF)
    print('-----------****************')
    # show() prints the leaderboard itself and returns None, so it is not wrapped in print()
    aml.getLeaderboard().show(truncate=False)

Answer 1

The fit method on H2OAutoML returns the leader model. Each model in Sparkling Water has a getFeatureImportances() method, which returns a Spark DataFrame containing the feature importances.

model = aml.fit(sparkDF)
model.getFeatureImportances().show()
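
To fit this into the function from the question, one option is to wrap the two calls in a small helper. The following is only a sketch: the helper name leader_feature_importances and its max_rows parameter are hypothetical, and it assumes nothing beyond what is stated above (fit() returns the leader model, and getFeatureImportances() returns a Spark DataFrame).

```python
def leader_feature_importances(aml, spark_df, max_rows=None):
    """Fit AutoML and return the leader model with its feature importances.

    aml      -- a configured H2OAutoML estimator (e.g. from pysparkling.ml)
    spark_df -- the Spark DataFrame to train on
    max_rows -- hypothetical convenience parameter: cap the number of rows returned
    """
    # fit() returns the leader model, so keep the return value
    leader = aml.fit(spark_df)
    # getFeatureImportances() returns a Spark DataFrame
    importances = leader.getFeatureImportances()
    if max_rows is not None:
        # limit() is the standard Spark DataFrame row cap
        importances = importances.limit(max_rows)
    return leader, importances


# Usage, assuming `aml` and `sparkDF` are set up as in the question:
# leader, importances = leader_feature_importances(aml, sparkDF)
# importances.show(truncate=False)
```

Returning the leader alongside the importances keeps the fitted model available for later calls such as transform(), so AutoML does not need to be run again.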
