mutate and predict.gam result in NA/NaN/Inf problem
I have an issue with some data, and I simply cannot understand why. I'm trying to estimate var4 from var3 using a GAM.
Here is the dataset I'm using to obtain my model:
for_model <- read.csv("https://raw.githubusercontent.com/fredlm/mockup/master/for_model.csv")
And the dataset in which I want to estimate var4:
for_est <- read.csv("https://raw.githubusercontent.com/fredlm/mockup/master/for_est.csv")
What I've done, simply:
for_est <- for_est %>%
  mutate(var4 = ifelse(!var3 == 0, predict.gam(gam(var4 ~ s(log(var3)), data = for_model), newdata = .), NA))
It returns the following error:
Error: Problem with mutate() column var4.
var4 = predict.gam(gam(var4 ~ s(log(var3)), data = for_model), newdata = .).
x NA/NaN/Inf in foreign function call (arg 1)
Despite thorough research on the web and a few hours spent on my data, I can't find how to fix this...
However, when I plot the GAM, things work great:
ggplot(data = for_model,
       aes(x = var3,
           y = var4)) +
  geom_point() +
  geom_smooth(method = "gam",
              formula = y ~ s(log(x)))
Any idea how to fix this? I've looked for NaN or Inf values but there are none. Also, when I try to estimate var4 from var2, which is VERY similar to var3, things work well...
for_est <- for_est %>%
  mutate(var4 = ifelse(!var2 == 0, predict.gam(gam(var4 ~ s(log(var2)), data = for_model), newdata = .), NA))
Thanks a lot!
PS: my apologies for the rather large files, but given that I don't understand the problem, I thought it might make more sense to provide all of them... :)
When you use ifelse to keep away from var3 == 0, you need to restrict the for_est input data the same way. ifelse() evaluates its yes argument on every row before it selects anything, so any row with var3 == 0 still reaches log() inside predict and produces -Inf. (I split the model fitting from the prediction just to make testing faster; that doesn't matter.)
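gamfit <- gam(var4 ~ s(log(var3)), data = for_model)
# Predict only for rows where log(var3) is finite, then write the values back by
# position; ifelse() would recycle the shorter prediction vector and misalign it.
nz <- which(for_est$var3 != 0)
for_est <- for_est %>%
  mutate(var4 = replace(rep(NA_real_, n()), nz, as.numeric(predict(gamfit, newdata = .[nz, ]))))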
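The two ifelse() behaviours that matter here can be checked in a few lines of base R. This is only a sketch: toy_predict is a hypothetical stand-in for predict() on the fitted GAM (it is not part of mgcv) and simply refuses non-finite input, and x is made-up data. It shows that ifelse() evaluates its yes argument on the whole vector, and that a yes vector shorter than the test is recycled, which is why writing the predictions back by index is the safer pattern.

```r
# Hypothetical stand-in for predict(): stops on non-finite log(x),
# loosely mimicking the "NA/NaN/Inf in foreign function call" failure.
toy_predict <- function(x) {
  stopifnot(all(is.finite(log(x))))
  2 * log(x)
}

x <- c(1, 0, 4, 0, 9)  # made-up values with zeros in the middle

# 1. ifelse() does not shield the zeros: `yes` is computed for all of x first,
#    so toy_predict() still sees log(0) = -Inf and fails.
res <- try(ifelse(x != 0, toy_predict(x), NA_real_), silent = TRUE)
inherits(res, "try-error")

# 2. Predicting on the subset avoids the error, but ifelse() recycles the
#    shorter `yes` vector to the full length and then picks by position,
#    so the values can land on the wrong rows.
ok  <- which(x != 0)
bad <- ifelse(x != 0, toy_predict(x[ok]), NA_real_)

# 3. Index assignment keeps each prediction on the row it belongs to.
out <- rep(NA_real_, length(x))
out[ok] <- toy_predict(x[ok])

cbind(x, bad, out)  # compare the two columns row by row
```

If the rows to be skipped all sit at the end of the data, bad and out agree, so the misalignment can go unnoticed; comparing the two columns on data with interior zeros makes it visible.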