
How to eliminate “NA/NaN/Inf in foreign function call (arg 7)” running predict with randomForest

I have researched this extensively without finding a solution. I have cleaned my data set as follows:

library("raster")
impute.mean <- function(x) replace(x, is.na(x) | is.nan(x) | is.infinite(x) , 
mean(x, na.rm = TRUE))
losses <- apply(losses, 2, impute.mean)
colSums(is.na(losses))
isinf <- function(x) (NA <- is.infinite(x))
infout <- apply(losses, 2, is.infinite)
colSums(infout)
isnan <- function(x) (NA <- is.nan(x))
nanout <- apply(losses, 2, is.nan)
colSums(nanout)

The problem arises when running predict:

options(warn=2)
p  <-   predict(default.rf, losses, type="prob", inf.rm = TRUE, na.rm=TRUE, nan.rm=TRUE)

Everything I have read says the cause should be NAs, Infs or NaNs in the data, but I don't find any. I am making the data and the randomForest summary available for sleuthing at [deleted]. The traceback doesn't reveal much (to me anyway):

4: .C("classForest", mdim = as.integer(mdim), ntest = as.integer(ntest), 
       nclass = as.integer(object$forest$nclass), maxcat = as.integer(maxcat), 
       nrnodes = as.integer(nrnodes), jbt = as.integer(ntree), xts = as.double(x), 
       xbestsplit = as.double(object$forest$xbestsplit), pid = object$forest$pid, 
       cutoff = as.double(cutoff), countts = as.double(countts), 
       treemap = as.integer(aperm(object$forest$treemap, c(2, 1, 
           3))), nodestatus = as.integer(object$forest$nodestatus), 
       cat = as.integer(object$forest$ncat), nodepred = as.integer(object$forest$nodepred), 
       treepred = as.integer(treepred), jet = as.integer(numeric(ntest)), 
       bestvar = as.integer(object$forest$bestvar), nodexts = as.integer(nodexts), 
       ndbigtree = as.integer(object$forest$ndbigtree), predict.all = as.integer(predict.all), 
       prox = as.integer(proximity), proxmatrix = as.double(proxmatrix), 
       nodes = as.integer(nodes), DUP = FALSE, PACKAGE = "randomForest")
3: predict.randomForest(default.rf, losses, type = "prob", inf.rm = TRUE, 
       na.rm = TRUE, nan.rm = TRUE)
2: predict(default.rf, losses, type = "prob", inf.rm = TRUE, na.rm = TRUE, 
       nan.rm = TRUE)
1: predict(default.rf, losses, type = "prob", inf.rm = TRUE, na.rm = TRUE, 
       nan.rm = TRUE)

Your code is not entirely reproducible (the actual randomForest call is missing), but the problem is that you are not replacing Inf values with finite column means. The na.rm = TRUE argument in the call to mean() within your impute.mean function does exactly what it says: it removes NA values (and not Inf ones). For any column that contains Inf, the computed mean is therefore itself non-finite, and that is the value written back into the column.

You can see this, for example, by:

impute.mean <- function(x) replace(x, is.na(x) | is.nan(x) | is.infinite(x), mean(x, na.rm = TRUE))
losses <- apply(losses, 2, impute.mean)
sum( apply( losses, 2, function(.) sum(is.infinite(.))) )
# [1] 696
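
A minimal illustration of why this happens, using a made-up vector rather than the data from the question: na.rm = TRUE drops NA and NaN but keeps Inf, so the resulting mean is not finite.

```r
# Toy vector (hypothetical values, not taken from the losses data)
x <- c(1, 2, NA, Inf)

# na.rm = TRUE drops NA/NaN only; Inf still takes part in the mean
m <- mean(x, na.rm = TRUE)
is.infinite(m)   # TRUE

# Restricting to finite values gives a usable replacement value
m_finite <- mean(x[is.finite(x)])
m_finite         # 1.5
```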

To get rid of infinite values, use:

impute.mean <- function(x) replace(x, is.na(x) | is.nan(x) | is.infinite(x), mean(x[!is.na(x) & !is.nan(x) & !is.infinite(x)]))
losses <- apply(losses, 2, impute.mean)
sum(apply( losses, 2, function(.) sum(is.infinite(.)) ))
# [1] 0
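
The same idea can probably be written more compactly with is.finite(), which is FALSE for NA, NaN, Inf and -Inf alike. The sketch below uses a small made-up matrix in place of losses, and impute_finite_mean is a name chosen only for this illustration.

```r
# Replace every non-finite entry with the mean of the finite entries
impute_finite_mean <- function(x) {
  bad <- !is.finite(x)
  replace(x, bad, mean(x[!bad]))
}

# Toy stand-in for the losses matrix (hypothetical values)
toy <- cbind(u = c(1, Inf, 3, NA), v = c(-Inf, 2, NaN, 4))
toy <- apply(toy, 2, impute_finite_mean)

sum(!is.finite(toy))  # 0
```

Note that a column with no finite values at all would still end up as NaN, since the mean of an empty vector is NaN; such columns need separate handling.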

One cause of the error message:

NA/NaN/Inf in foreign function call (arg X)

when training a randomForest is having character-class variables in your data.frame. If it comes with the warning:

NAs introduced by coercion

check to make sure that all of your character variables have been converted to factors.

Example

set.seed(1)
dat <- data.frame(
  a = runif(100),
  b = rpois(100, 10),
  c = rep(c("a","b"), 50),  # 100 values, matching the length of a and b
  stringsAsFactors = FALSE
)

library(randomForest)
randomForest(a ~ ., data = dat)

Yields:

Error in randomForest.default(m, y, ...) : 
  NA/NaN/Inf in foreign function call (arg 1)
In addition: Warning message:
In data.matrix(x) : NAs introduced by coercion

But switch it to stringsAsFactors = TRUE and it runs.
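
If the data frame already exists, the character columns can be converted in place instead of rebuilding it. A minimal sketch in base R, assuming every character column is meant to be categorical (the small data frame here is made up for illustration):

```r
# Small data frame with one character column (hypothetical data)
dat <- data.frame(
  a = c(0.1, 0.5, 0.9, 0.3),
  c = c("a", "b", "a", "b"),
  stringsAsFactors = FALSE
)

# Find character columns and convert each one to a factor
is_chr <- vapply(dat, is.character, logical(1))
dat[is_chr] <- lapply(dat[is_chr], factor)

# Inspect the resulting column classes
sapply(dat, class)
```

When predicting on new data, the factor levels should match those used in training, so it is worth applying the same conversion to both data sets.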
