H2O: Deep learning object not found in function "predict" for argument "model"

Question

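I am just testing h2o, in particular its deep learning capabilities, because I have heard good things about it. So far I have been using the following code: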

library(h2o)
library(caret)
data("iris")

# Initiate H2O --------------------
h2o.removeAll() # Clean up. Just in case H2O was already running
h2o.init(nthreads = -1, max_mem_size="22G")  # Start an H2O cluster with all threads available

# Get training and tournament data -------------------
a <- createDataPartition(iris$Species, list=FALSE)
training <- iris[a,]
test <- iris[-a,]

# Convert target to factor -------------------
target <- as.factor(iris$Species)

feature_names <- names(train)[1:(ncol(train)-1)]

train_h2o <- as.h2o(train)
test_h2o <- as.h2o(test)

prob <- test[, "id", drop = FALSE]

model_dl <- h2o.deeplearning(x = feature_names, y = "target", training_frame = train_h2o, stopping_metric = "logloss")
h2o.logloss(model_dl)

pred_dl <- predict(model_dl, newdata = tourn_h2o)
prob <- cbind(prob, as.data.frame(pred_dl$p1, col.names = "dl"))
write.table(prob[, c("id", "dl")], paste0(model_dl@model_id, ".csv"), sep = ",", row.names = FALSE, col.names = c("id", "probability"))

The relevant part is actually the last line, where I get the following error:

Error in .h2o.doSafeREST(h2oRestApiVersion = h2oRestApiVersion, urlSuffix = page,  : 


ERROR MESSAGE:

Object 'DeepLearning_model_R_1494350691427_70' not found in function: predict for argument: model

Has anyone run into this? Is there a simple solution I might be missing? Thanks in advance.

EDIT: With the updated code, I get this error message:

Error in .h2o.doSafeREST(h2oRestApiVersion = h2oRestApiVersion, urlSuffix = page,  : 


ERROR MESSAGE:

Illegal argument(s) for DeepLearning model: DeepLearning_model_R_1494428751150_1.  Details: ERRR on field: _train: Training data must have at least 2 features (incl. response).
ERRR on field: _stopping_metric: Stopping metric cannot be logloss for regression.

I think this has something to do with the way the Iris dataset is being read in.

Answer 1

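Answer to the first question: your original error message sounds like the kind you get when things are out of sync. For example, you might have two sessions running at the same time and remove the model in one of them; the other session will not know that its variable is now stale. H2O allows multiple connections, but they have to cooperate. (Flow, described in the next paragraph, counts as a second session.)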

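Unless you can produce a reproducible example, shrug, put it down to gremlins, and start a fresh session. Alternatively, look at the data and models in Flow (a web server that is always running on 127.0.0.1:54321) and see whether something is no longer there.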
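The same check can be made from R instead of Flow. The following is a minimal sketch, assuming a local cluster and a small throwaway model trained on iris (neither is from the original post). It uses h2o.ls() to list the keys the cluster currently holds, then removes the model with h2o.rm() to imitate what a second session could do:

```r
library(h2o)
h2o.init()  # connects to a running cluster, or starts a local one

# Throwaway model, trained only so there is something to look up (assumption).
iris_h2o <- as.h2o(iris)
model_dl <- h2o.deeplearning(x = 1:4, y = "Species",
                             training_frame = iris_h2o, epochs = 1)

# h2o.ls() lists the keys of the objects currently stored on the cluster.
keys <- as.character(h2o.ls()$key)
model_exists <- model_dl@model_id %in% keys

# Imitate another session deleting the model: the R variable model_dl
# survives, but the object it points to is gone from the cluster.
h2o.rm(model_dl@model_id)
model_gone <- !(model_dl@model_id %in% as.character(h2o.ls()$key))
```

If model_exists is FALSE for a model you are about to pass to predict(), the R variable is stale, and retraining or reloading the model is the way forward.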

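As for your EDIT question: your model is being built as a regression model, but you are asking for logloss, so you evidently believed you were doing classification. This happens because the target variable was not set to be a factor. Your current as.factor() line operates on the wrong data, in the wrong place. It should come after your as.h2o() lines: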

train_h2o <- as.h2o(training)  #Typo fix
test_h2o <- as.h2o(test)

feature_names <- names(training)[1:(ncol(training)-1)]  #typo fix
y = "Species" #The column we want to predict

train_h2o[,y] <- as.factor(train_h2o[,y])
test_h2o[,y] <- as.factor(test_h2o[,y])

Then build the model with:

model_dl <- h2o.deeplearning(x = feature_names, y = y, training_frame = train_h2o, stopping_metric = "logloss")

Get the predictions:

pred_dl <- predict(model_dl, newdata = test_h2o)  #Typo fix

And compare the correct answers with the predictions using:

cbind(test[, y], as.data.frame(pred_dl$predict))

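(By the way, H2O always detects the Iris dataset columns perfectly as numeric vs. factor, so the as.factor() lines above are not needed; your error message must have come from your original data.)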
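Putting the corrected pieces together, a complete script might look like the sketch below. The 80/20 split via sample(), the seed, and the use of h2o.isfactor() to check the response type before training are assumptions added for illustration; the rest follows the snippets above.

```r
library(h2o)
h2o.init()

# Assumed 80/20 split; the seed is only there to make the sketch repeatable.
set.seed(42)
a <- sample(nrow(iris), 0.8 * nrow(iris))
training <- iris[a, ]
test <- iris[-a, ]

train_h2o <- as.h2o(training)
test_h2o <- as.h2o(test)

y <- "Species"                                # the column we want to predict
feature_names <- setdiff(names(training), y)  # every other column

# Check how H2O typed the response; convert only if it is not a factor yet.
response_is_factor <- all(h2o.isfactor(train_h2o[, y]))
if (!response_is_factor) {
  train_h2o[, y] <- as.factor(train_h2o[, y])
  test_h2o[, y] <- as.factor(test_h2o[, y])
}

model_dl <- h2o.deeplearning(x = feature_names, y = y,
                             training_frame = train_h2o,
                             stopping_metric = "logloss")

pred_dl <- predict(model_dl, newdata = test_h2o)

# Actual species next to the predicted species, one row per test row.
comparison <- cbind(actual = test[, y], as.data.frame(pred_dl$predict))
```

With the response typed as a factor, H2O builds a classifier, so the "Stopping metric cannot be logloss for regression" error should not occur.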

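StackOverflow advice: test your reproducible example in full, then copy and paste that exact code, along with the exact error message that code gives you. Your code has lots of small typos. For example, train in some places and training in others. createDataPartition() was not given; I assumed a = sample(nrow(iris), 0.8*nrow(iris)). test has no "id" column.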

Other H2O advice:

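Run h2o.removeAll() after h2o.init(). If it runs before, it will give you an error message. (Personally I avoid that function; it is the kind of thing that gets left in a production script by mistake...)
Consider importing the data into h2o up front and splitting it with h2o.splitFrame(). In other words, avoid doing things in R that H2O can handle easily.
If you can, avoid holding the data in R at all. Prefer h2o.importFile() over as.h2o().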

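The idea behind those last two points is that H2O will scale beyond the memory of a single machine, while R will not. It is also less confusing than trying to keep track of the same thing in two places.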
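As a rough illustration of the last two points, the sketch below imports a CSV straight into H2O and splits it there. The temporary CSV written from R's built-in iris is an assumption made only so the example has a file to read; with real data the file would already exist somewhere the H2O cluster can reach. The 0.8 ratio and the seed are also illustrative.

```r
library(h2o)
h2o.init()

# Write iris to a temporary CSV purely so there is a file to import
# (assumption for illustration; real data would already be on disk).
csv_path <- file.path(tempdir(), "iris.csv")
write.csv(iris, csv_path, row.names = FALSE)

# Parse the file directly on the H2O side instead of going through as.h2o().
iris_h2o <- h2o.importFile(csv_path)

# Split inside H2O. The split is approximate, not an exact 80/20 row count.
splits <- h2o.splitFrame(iris_h2o, ratios = 0.8, seed = 1)
train_h2o <- splits[[1]]
test_h2o <- splits[[2]]
```

Here the data only ever lives in H2O, so there is no second copy in R to keep in step with it.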

Answer 2

I had the same problem, but it was easy to solve.

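My error occurred because I read in the h2o object before initializing the h2o cluster. That is, I trained an h2o model, saved it, shut down the cluster, loaded the model, and only then initialized the cluster again.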

The cluster should already be initialized (h2o.init()) before you read in the h2o object.
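A minimal sketch of that ordering, assuming a throwaway model trained on iris and saved under tempdir() (both are illustrative choices, not from the answer). It saves with h2o.saveModel() and reads back with h2o.loadModel(), calling h2o.init() before the load:

```r
library(h2o)
h2o.init()  # the cluster must be up before any h2o object is read in

# Throwaway model, trained only so there is something to save (assumption).
iris_h2o <- as.h2o(iris)
model_dl <- h2o.deeplearning(x = 1:4, y = "Species",
                             training_frame = iris_h2o, epochs = 1)

# h2o.saveModel() returns the full path of the file it wrote.
model_path <- h2o.saveModel(model_dl, path = tempdir(), force = TRUE)

# In a later session the order matters: h2o.init() first, then the load.
# Here the cluster is still running, so this call simply reconnects to it.
h2o.init()
model_restored <- h2o.loadModel(model_path)

pred <- predict(model_restored, newdata = iris_h2o)
```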

2 solutions

Solution 1
1 2017-05-12 08:14:24

Solution 2
0 2021-03-30 09:57:12