
Error after applying lm in loop

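I am trying to apply the following code, which works fine on any data without NA values. However, when I include data that has NA values, I get the following message: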

Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) : 0 (non-NA) cases

The code I am using is:

# One result row per Year/Firm combination
m <- data.frame(matrix(ncol = 5, nrow = length(unique(df$Year)) * length(unique(df$Firm))))
l <- 0
for (i in unique(df$Year)) {
  for (j in unique(df$Firm)) {
    l <- l + 1
    # Fit the model on the rows of this Year/Firm pair only
    mod <- lm(Ri ~ Rm + Rz, data = df, subset = df$Year == i & df$Firm == j)
    m[l, ] <- c(i,
                as.character(j),
                mod$coefficients[2],
                mod$coefficients[3],
                summary(mod)$sigma)
  }
}
names(m) <- c("Year", "Firm", "B1", "B2", "e")

Here is a sample of the data I am working with:

Year   Firm    Ri    Rm    Rz
2009   A       30    55    NA
2009   A       0     55    NA
2009   A       1     55    NA
2010   A       7     55    85
2010   A       15    NA    85
2011   A       0     55    85
2011   A       3.5   55    85
2011   A       8     NA    85
2009   B       24    55    85
2009   B       30    55    85
2009   B       25    55    85
2010   B       5.2   NA    85
2010   B       11.8  55    85
2011   B       0     55    NA
2011   B       90    55    NA
2011   B       57    55    NA

Any suggestions?

Apart from the data issue above, you could also rewrite your code using the dplyr and broom packages, as follows:

library(dplyr)
library(broom)  # provides tidy()
df$Rz <- 85 # Impute values of Rz to make the code work
df %>% group_by(Year, Firm) %>% do(tidy(lm(Ri ~ Rm + Rz, data = .)))

Source: local data frame [6 x 7]
Groups: Year, Firm [6]

   Year   Firm        term estimate std.error statistic     p.value
  <int> <fctr>       <chr>    <dbl>     <dbl>     <dbl>       <dbl>
1  2009      A (Intercept) 10.33333  9.837570  1.050395 0.403735888
2  2009      B (Intercept) 26.33333  1.855921 14.188819 0.004930448
3  2010      A (Intercept)  7.00000       NaN       NaN         NaN
4  2010      B (Intercept) 11.80000       NaN       NaN         NaN
5  2011      A (Intercept)  1.75000  1.750000  1.000000 0.500000000
6  2011      B (Intercept) 49.00000 26.286879  1.864048 0.203331016

Update: added a filter step so that lm is only fitted to Year/Firm groups in which neither independent variable is entirely NA:

df %>% group_by(Year, Firm) %>% filter(!all(is.na(Rm)) & !all(is.na(Rz))) %>% do(tidy(lm(Ri ~ Rm + Rz, data = .)))
Source: local data frame [4 x 7]
Groups: Year, Firm [4]

   Year   Firm        term estimate std.error statistic     p.value
  <int> <fctr>       <chr>    <dbl>     <dbl>     <dbl>       <dbl>
1  2009      B (Intercept) 26.33333  1.855921  14.18882 0.004930448
2  2010      A (Intercept)  7.00000       NaN       NaN         NaN
3  2010      B (Intercept) 11.80000       NaN       NaN         NaN
4  2011      A (Intercept)  1.75000  1.750000   1.00000 0.500000000
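
The filter above checks each independent variable separately. A somewhat stricter check, sketched here in base R, counts the rows per group that are complete across all three model variables, since those are the rows lm actually uses. The names `complete`, `counts` and `usable` are illustrative choices, and the data frame is rebuilt from the sample in the question so that the snippet runs on its own:

```r
# Rebuild the sample data from the question
df <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
Year Firm Ri   Rm Rz
2009 A    30   55 NA
2009 A    0    55 NA
2009 A    1    55 NA
2010 A    7    55 85
2010 A    15   NA 85
2011 A    0    55 85
2011 A    3.5  55 85
2011 A    8    NA 85
2009 B    24   55 85
2009 B    30   55 85
2009 B    25   55 85
2010 B    5.2  NA 85
2010 B    11.8 55 85
2011 B    0    55 NA
2011 B    90   55 NA
2011 B    57   55 NA
")

# TRUE for rows with no NA in any variable used by the model
df$complete <- complete.cases(df[, c("Ri", "Rm", "Rz")])

# Number of usable rows in each Year/Firm group
counts <- aggregate(complete ~ Year + Firm, data = df, FUN = sum)

# Groups with at least one usable row; lm fails on the others
usable <- counts[counts$complete > 0, ]
print(counts)
```

Groups whose count is zero are the ones that trigger the "0 (non-NA) cases" error. A group with a single usable row can still be fitted, but its standard errors come out as NaN, as in the output above.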

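This output shows only intercept terms because the independent variables in the sample data provided do not vary. With data that does have such variability (e.g. the mtcars dataset), you get output like the following: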

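# Note: data = mtcars refits the full data set for every cyl group, which is why
# the three blocks of coefficients below are identical; use data = . for per-group fits
mtcars %>% group_by(cyl) %>% do(tidy(lm(mpg ~ wt + am, data = mtcars)))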
Source: local data frame [9 x 6]
Groups: cyl [3]

    cyl        term    estimate std.error   statistic      p.value
  <dbl>       <chr>       <dbl>     <dbl>       <dbl>        <dbl>
1     4 (Intercept) 37.32155131 3.0546385 12.21799285 5.843477e-13
2     4          wt -5.35281145 0.7882438 -6.79080719 1.867415e-07
3     4          am -0.02361522 1.5456453 -0.01527855 9.879146e-01
4     6 (Intercept) 37.32155131 3.0546385 12.21799285 5.843477e-13
5     6          wt -5.35281145 0.7882438 -6.79080719 1.867415e-07
6     6          am -0.02361522 1.5456453 -0.01527855 9.879146e-01
7     8 (Intercept) 37.32155131 3.0546385 12.21799285 5.843477e-13
8     8          wt -5.35281145 0.7882438 -6.79080719 1.867415e-07
9     8          am -0.02361522 1.5456453 -0.01527855 9.879146e-01
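
The coefficient rows above are the same for every value of cyl because each group is fitted to the whole of mtcars rather than to its own rows. As a minimal sketch of a genuinely per-group fit, here is a base R version that needs neither dplyr nor broom; the names `fits` and `coefs` are illustrative:

```r
# Split mtcars by cylinder count and fit the same model to each subset
fits <- lapply(split(mtcars, mtcars$cyl),
               function(d) lm(mpg ~ wt + am, data = d))

# One row of coefficients per cyl group
coefs <- t(sapply(fits, coef))
print(coefs)

# Whole-data fit, for comparison with the per-group estimates
full <- coef(lm(mpg ~ wt + am, data = mtcars))
print(full)
```

With the dplyr pipeline, passing `data = .` in place of `data = mtcars` should give the same per-group behaviour.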

Edit: adding a simple example to demonstrate the issue in the original post:

x <- 1:10
y <- 1:10
z <- NA
df <- data.frame(x = x, y = y, z = z)
lm(x ~ y + z, data = df)
Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) : 
  0 (non-NA) cases
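
One way to guard against this error, sketched under the assumption that the formula names its variables explicitly (no `.` shorthand), is to count the complete cases before calling lm. The helper name `fit_or_null` is hypothetical and not part of any package:

```r
# Hypothetical helper: returns NULL instead of failing when no row is complete
fit_or_null <- function(formula, data) {
  vars <- all.vars(formula)  # variable names used in the formula
  n_complete <- sum(complete.cases(data[, vars, drop = FALSE]))
  if (n_complete == 0) {
    return(NULL)             # nothing to fit: lm would stop with "0 (non-NA) cases"
  }
  lm(formula, data = data)
}

df <- data.frame(x = 1:10, y = 1:10, z = NA)

skipped <- fit_or_null(x ~ y + z, df)  # z is all NA, so no model is fitted
fit <- fit_or_null(x ~ y, df)          # no NA in x or y, so lm runs normally

print(is.null(skipped))
print(coef(fit))
```

Inside a loop over Year/Firm groups, a NULL result could be used to write NA into that group's result row instead of stopping the whole loop.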

