variable lengths differ (found for 'x')

I've seen several cases of this error, but none of them seem to solve or apply to my situation.

I am building a logistic regression model with biglm.

I have a data.frame with ~250 variables and a little over a million rows.

Since bigglm() doesn't work with the dot notation to select all variables in the model, I am building my formula from the column names, as described in a linked post.

So if f is my formula and df is my data frame, then my model looks like this:

fit <- bigglm(f, data = df, family=binomial(link="logit"), chunksize=100, maxit=10)

And I get the error: variable lengths differ (found for 'x')

When I check the length of x, it is exactly the same as the number of rows in df.

Other StackOverflow questions seem to suggest it might be a problem with the way the formula is constructed. Or perhaps it is a problem with biglm?
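One way to test the "formula construction" suspicion is to compare the variables named in the formula against the columns of the data frame. The following is a minimal sketch on a small made-up data frame; the columns y, x, z and the names missing_vars, f_bad, and missing_bad are illustrative assumptions, not part of the real data. A variable that is not a column of the data is looked up in the formula's environment instead, where an object of the same name may have a different length.

```r
# Toy data standing in for the real data frame (hypothetical columns).
df <- data.frame(y = c(0, 1, 0, 1),
                 x = c(1.2, 3.4, 2.2, 5.1),
                 z = c(10, 20, 30, 40))

# Build the formula from the column names, as in the question.
n <- names(df)
f <- as.formula(paste("y ~", paste(n[!n %in% "y"], collapse = " + ")))

# Variables named in the formula that are not columns of df.
# An empty result means every term can be resolved inside df itself.
missing_vars <- setdiff(all.vars(f), names(df))
print(missing_vars)

# A formula that refers to a variable outside df is flagged.
f_bad <- y ~ x + w
missing_bad <- setdiff(all.vars(f_bad), names(df))
print(missing_bad)
```

If the first check comes back empty, the formula terms themselves all exist in df, and the mismatch is more likely to come from where each variable is being evaluated.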

I was able to solve this issue by making a slight modification to the way I was constructing my formula for bigglm().

As shown in the link attached in my question, I was constructing the formula like this:

n <- names(df)
f <- as.formula(paste("y ~", paste(n[!n %in% "y"], collapse = " + ")))

What f was missing was the df$ before each variable name in the formula. Modifying the string passed to as.formula() so that "df$" is prepended to each variable name fixed this issue.
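The modified construction might look like the sketch below. The exact code of the fix is not shown above, so this is an assumption about its shape: the toy data frame, the name predictors, and the choice to prefix the response y as well as the predictors are all illustrative.

```r
# Toy data standing in for the real data frame (hypothetical columns).
df <- data.frame(y = c(0, 1, 0, 1),
                 x = c(1.2, 3.4, 2.2, 5.1),
                 z = c(10, 20, 30, 40))

n <- names(df)
predictors <- n[!n %in% "y"]

# Prepend "df$" to every variable name before building the formula.
# Prefixing the response too is an assumption made for consistency.
f <- as.formula(paste("df$y ~", paste0("df$", predictors, collapse = " + ")))
print(f)
```

With this form every term refers to a full column of the df object itself rather than to a column looked up in the data argument, so all variables in the formula have the same length.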
