mutate and predict.gam give an NA/NaN/Inf error

Question

I have an issue with some data, and I simply cannot understand why.

I'm trying to estimate var4 from var3 using a GAM.

Here is the dataset I'm using to obtain my model:

for_model <- read.csv("https://raw.githubusercontent.com/fredlm/mockup/master/for_model.csv")

And the dataset in which I want to estimate var4:

for_est <- read.csv("https://raw.githubusercontent.com/fredlm/mockup/master/for_est.csv")

What I've done, simply:

for_est <- for_est %>%
  mutate(var4 = ifelse(!var3 == 0, predict.gam(gam(var4 ~ s(log(var3)), data = for_model), newdata = .), NA))

It returns the following error:

Error: Problem with mutate() column var4.
var4 = predict.gam(gam(var4 ~ s(log(var3)), data = for_model), newdata = .).
x NA/NaN/Inf in foreign function call (arg 1)

Despite thorough research on the web and a few hours spent on my data, I can't find how to fix this...

However, when I plot the GAM, things work great:

ggplot(data = for_model,
       aes(x = var3,
           y = var4)) +
  geom_point() +
  geom_smooth(method = "gam",
              formula = y~s(log(x)))

Any idea how to fix this? I've looked for NaN or Inf values but there are none. Also, when I try to estimate var4 from var2, which is VERY similar to var3, things work well:

for_est <- for_est %>%
  mutate(var4 = ifelse(!var2 == 0, predict.gam(gam(var4 ~ s(log(var2)), data = for_model), newdata = .), NA))

Thanks a lot!

PS: my apologies for the rather large files, but given that I don't understand the problem, I thought it might make more sense to provide all of them... :)

Answer 1

When you use ifelse to keep away from var3 == 0, you need to restrict the for_est input data the same way. ifelse only chooses between results that have already been computed, so predict.gam still sees every row of newdata, including those where var3 is 0; log(0) is -Inf, and that is what triggers the NA/NaN/Inf error. (I split the model fitting from the predicting just to make testing faster; that doesn't matter.)

gamfit <- gam(var4 ~ s(log(var3)), data = for_model)
for_est <- for_est %>%
  mutate(var4 = ifelse(var3 != 0, predict(gamfit, newdata = .[var3 != 0, ]), NA_real_))
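
One thing to double-check with the ifelse form: as far as I know, ifelse recycles its yes argument to the length of the test and then picks values by position, so a prediction vector that is shorter than for_est may end up on the wrong rows unless the zeros all sit at the end. A minimal sketch of an alternative that assigns predictions by index instead is below. The data here is simulated (the toy for_model and for_est and the helper name ok are assumptions for illustration only, not the CSV files from the question).

```r
library(mgcv)

set.seed(1)

# Simulated stand-ins for the question's data (assumption: toy values only)
for_model <- data.frame(var3 = runif(200, 0.1, 10))
for_model$var4 <- 2 * log(for_model$var3) + rnorm(200, sd = 0.2)
for_est <- data.frame(var3 = c(0, 1, 5, 0, 8))

# The root cause: log(0) is not finite, and the smooth is built on log(var3)
is.finite(log(0))  # FALSE

gamfit <- gam(var4 ~ s(log(var3)), data = for_model)

# Rows that can be predicted: var3 present and strictly positive
ok <- !is.na(for_est$var3) & for_est$var3 > 0

# Start with NA everywhere, then fill only the predictable rows by index,
# so each prediction lands on the row it was computed from
for_est$var4 <- NA_real_
for_est$var4[ok] <- predict(gamfit, newdata = for_est[ok, , drop = FALSE])
```

Because the predictions are written back through the same logical index that selected the rows, the row order of for_est does not matter.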
