
mutate() and predict.gam() fail with an NA/NaN/Inf error

I have an issue with some data, and I simply cannot understand why.

I'm trying to estimate var4 from var3 using a GAM.

Here is the dataset I'm using to fit my model:

for_model <- read.csv("https://raw.githubusercontent.com/fredlm/mockup/master/for_model.csv")

And the dataset in which I want to estimate var4:

for_est <- read.csv("https://raw.githubusercontent.com/fredlm/mockup/master/for_est.csv")

What I've done, simply:

for_est <- for_est %>%
  mutate(var4 = ifelse(!var3 == 0, predict.gam(gam(var4 ~ s(log(var3)), data = for_model), newdata = .), NA))

It returns the following error:

Error: Problem with mutate() column var4.
var4 = predict.gam(gam(var4 ~ s(log(var3)), data = for_model), newdata = .).
x NA/NaN/Inf in foreign function call (arg 1)

Despite thorough searching on the web and a few hours spent on my data, I can't find how to fix this...

However, when I plot the GAM, things work great:

ggplot(data = for_model,
       aes(x = var3,
           y = var4)) +
  geom_point() +
  geom_smooth(method = "gam",
              formula = y~s(log(x)))

Any idea how to fix this? I've looked for NaN or Inf values but there are none. Also, when I try to estimate var4 from var2, which is VERY similar to var3, things work well...

for_est <- for_est %>%
  mutate(var4 = ifelse(!var2 == 0, predict.gam(gam(var4 ~ s(log(var2)), data = for_model), newdata = .), NA))
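A search for NaN or Inf in the raw columns can come up empty even though the model still receives non-finite input, because the formula applies log() first and log(0) is -Inf. A minimal sketch on a made-up vector (x is hypothetical, not a column from the files above):

```r
# Hypothetical stand-in for a predictor column; not taken from for_est
x <- c(0, 1.5, 3, 0, 7)

# The raw values contain no NaN or Inf ...
raw_bad <- any(is.nan(x) | is.infinite(x))

# ... but their logarithm does, because log(0) is -Inf
log_bad <- any(!is.finite(log(x)))

# Positions of the zeros that produce the non-finite values
zero_rows <- which(x == 0)
```

Running the same checks on for_est$var3 and for_est$var2 would show whether the two columns differ in this respect; that they do is an assumption here, not something verified against the files.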

Thanks a lot!

PS: my apologies for the rather large files, but given that I don't understand the problem, I thought it might make more sense to provide all of them... :)

When you use ifelse to avoid the rows where var3 == 0, you need to restrict the for_est input data passed to predict in the same way. ifelse evaluates its whole "yes" argument before picking elements, so without that restriction predict still sees the zero rows, where log(0) is -Inf. (I split the model fitting from the predicting just to make testing faster; that doesn't matter.)

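# Fit once, outside mutate()
gamfit <- gam(var4 ~ s(log(var3)), data = for_model)
for_est <- for_est %>%
  # Predict only on rows with var3 != 0 and write the results back by row index,
  # so each prediction stays on its own row; all other rows get NA
  mutate(var4 = replace(rep(NA_real_, n()), which(var3 != 0),
                        predict(gamfit, newdata = .[which(var3 != 0), ])))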
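When predicting on a subset of rows, alignment matters: predict() returns one value per row it was given, and ifelse() recycles a shorter "yes" vector, so values can end up on the wrong rows. The following is a minimal sketch on synthetic data, using only mgcv and base R; the data frames train and new and the columns x, y and y_hat are made up for illustration and do not come from the question's files.

```r
library(mgcv)  # provides gam() and its predict() method

set.seed(1)

# Made-up training data: strictly positive x, so log(x) is always finite
train <- data.frame(x = runif(200, 0.5, 10))
train$y <- sin(log(train$x)) + rnorm(200, sd = 0.1)

# Made-up prediction data with zeros mixed in (log(0) is -Inf)
new <- data.frame(x = c(0, 2, 0, 5, 8))

fit <- gam(y ~ s(log(x)), data = train)

# Predict only on the rows that are safe, then assign by row index
keep <- which(new$x != 0)
new$y_hat <- NA_real_  # default for the excluded rows
new$y_hat[keep] <- as.numeric(predict(fit, newdata = new[keep, , drop = FALSE]))

# For comparison: ifelse() recycles the 3 predictions over 5 rows,
# so the values it picks no longer line up with their rows
shifted <- ifelse(new$x != 0,
                  as.numeric(predict(fit, newdata = new[keep, , drop = FALSE])),
                  NA_real_)
```

Assigning through keep writes each prediction back to the row it was computed from, regardless of where the zeros sit in the data.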
