How to get the leader model's feature importances in H2O AutoML with PySparkling Water
I am using a Spark standalone cluster and running H2O PySparkling on it. I cannot find the function for getting the leader model's feature importances. Please help.
Code:
import pandas as pd
from pyspark.sql import SparkSession
from pysparkling import *
import h2o
from pyspark import SparkFiles
from pysparkling.ml import H2OAutoML

spark = SparkSession.builder.appName('SparkApplication').getOrCreate()
conf = H2OConf()
hc = H2OContext.getOrCreate(conf)

def xgb_automl_features_importance(data, target_metric):
    # Convert the input DataFrame to an H2OFrame, then to a Spark DataFrame
    hf = h2o.H2OFrame(data)
    sparkDF = hc.asSparkFrame(hf)
    # Identify the response column
    y = target_metric
    # Restrict AutoML to a single XGBoost model
    aml = H2OAutoML(labelCol=y)
    aml.setIncludeAlgos(["XGBoost"])
    aml.setMaxModels(1)
    aml.fit(sparkDF)
    print('-----------****************')
    # show() prints the table itself and returns None, so it is not wrapped in print()
    aml.getLeaderboard().show(truncate=False)
The fit method on H2OAutoML returns the leader model. Each model in SW (Sparkling Water) has the method getFeatureImportances(), which returns a Spark data frame with feature importances:
model=aml.fit(sparkDF)
model.getFeatureImportances().show()
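Applied to the function from the question, this could look roughly like the sketch below. It is only an illustration, not a tested recipe: it assumes an H2OContext is already running as in the question's code, the helper names leader_feature_importances and top_features are hypothetical, and the column names "Variable" and "Percentage" are an assumption about the returned data frame that should be checked with printSchema() on your Sparkling Water version.

```python
def leader_feature_importances(spark_df, label_col, max_models=1):
    """Fit H2OAutoML (XGBoost only) and return the leader's feature importances.

    Hypothetical helper: assumes an H2OContext has already been created.
    """
    # Imported lazily so this file can be loaded without Sparkling Water installed.
    from pysparkling.ml import H2OAutoML

    aml = H2OAutoML(labelCol=label_col)
    aml.setIncludeAlgos(["XGBoost"])
    aml.setMaxModels(max_models)
    leader = aml.fit(spark_df)  # fit returns the leader model
    return leader.getFeatureImportances()  # a Spark DataFrame


def top_features(rows, score_key, n=10):
    """Return the n rows with the highest value under score_key.

    rows is a list of plain dicts, e.g. [r.asDict() for r in df.collect()].
    """
    return sorted(rows, key=lambda row: row[score_key], reverse=True)[:n]


# Usage sketch (needs a running Spark + H2O cluster, so it is left commented out):
# importances = leader_feature_importances(sparkDF, "target")
# importances.show(truncate=False)
# rows = [r.asDict() for r in importances.collect()]
# # "Variable" and "Percentage" are assumed column names; verify with printSchema().
# for row in top_features(rows, "Percentage", n=5):
#     print(row["Variable"], row["Percentage"])
```

Collecting the importances to the driver is reasonable here because the data frame has only one row per feature.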