
Extract R^2 (R-squared) value for each regression grouped by a factor

I'm wondering if there is a way to extract the R^2 of each regression when one model is fitted per group, as in the code below.

d <- data.frame(
  state = rep(c('NY', 'CA'), 10),
  year = rep(1:10, 2),
  response= rnorm(20)
)

library(plyr)
models <- dlply(d, "state", function(df) 
  lm(response ~ year, data = df))

ldply(models, coef)
l_ply(models, summary, .print = TRUE)

I tried:

l_ply(models, summary$r.squared, .print = TRUE)

But this throws the following error message:

Error in summary$r.squared : object of type 'closure' is not subsettable

You can do this to get the R-squared value and the coefficients:

ldply(models, function(x) {
  s <- summary(x)  # compute the model summary once and reuse it
  data.frame(r.sq      = s$r.squared,
             intercept = s$coefficients[1],   # estimate of the intercept
             beta      = s$coefficients[2])   # estimate of the slope on year
})
#  state        r.sq intercept        beta
#1    CA 0.230696121 0.4915617 -0.12343947
#2    NY 0.003506936 0.1971734 -0.01227367
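
For readers who would rather not depend on plyr, the same table can probably be built with base R alone. This is only a sketch: split() plus lapply() stands in for dlply, do.call(rbind, ...) stands in for ldply, and the seed and the names fits, rows, and result are assumptions added for illustration, not part of the original answer.

```r
# Same toy data as in the question; the seed is an assumption added so the
# numbers are reproducible from run to run.
set.seed(123)
d <- data.frame(
  state = rep(c("NY", "CA"), 10),
  year = rep(1:10, 2),
  response = rnorm(20)
)

# Base-R stand-in for dlply(): one lm fit per state, returned as a named list.
fits <- lapply(split(d, d$state), function(df) lm(response ~ year, data = df))

# Build one single-row data frame per model, then stack them.
rows <- lapply(names(fits), function(s) {
  fit <- fits[[s]]
  data.frame(
    state = s,
    r.sq = summary(fit)$r.squared,
    intercept = coef(fit)[["(Intercept)"]],
    beta = coef(fit)[["year"]]
  )
})
result <- do.call(rbind, rows)
print(result)
```

Because the response is random, the values differ from the output shown above unless the same seed and data are used.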

Using the broom package to convert statistical analysis objects into data frames, and dplyr for bind_rows:

library(dplyr) ; library(broom)
cbind(
  state = attr(models, "split_labels"),
  bind_rows(lapply(models, function(x) cbind(
    intercept = tidy(x)$estimate[1],
    beta = tidy(x)$estimate[2],
    glance(x))))
)

  state  intercept        beta  r.squared adj.r.squared    sigma statistic   p.value df    logLik      AIC      BIC deviance df.residual
1    CA 0.38653551 -0.05459205 0.01427426   -0.10894146 1.434599 0.1158477 0.7423473  2 -16.68252 39.36505 40.27280 16.46460           8
2    NY 0.09028554 -0.08462742 0.04138985   -0.07843642 1.287909 0.3454155 0.5729312  2 -15.60387 37.20773 38.11549 13.26968           8

You can try this:

sapply(models, function(x) summary(x)$r.squared)
     CA      NY 
0.05639 0.23751 

If you try

> typeof( summary )
[1] "closure"

you see that summary is a function. You are trying to access a field of the result, but summary$r.squared tries to access that field on the function (closure) itself rather than on the object the function returns.
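
The difference can be seen in a small base-R sketch. The toy data, the seed, and the names err, fit, and r2 are assumptions made up for this illustration; only summary, lm, and tryCatch are standard R.

```r
# summary is a function (a closure); `$` cannot index into a function.
print(typeof(summary))

# Capture the error message instead of stopping the script.
err <- tryCatch(summary$r.squared, error = function(e) conditionMessage(e))
print(err)

# Calling summary() on a fitted model returns an object that does carry
# an r.squared element, so `$` works on the *result* of the call.
set.seed(1)  # assumed seed, only for reproducibility
toy <- data.frame(x = 1:10, y = rnorm(10))
fit <- lm(y ~ x, data = toy)
r2 <- summary(fit)$r.squared
print(r2)
```

The point is simply that summary(fit)$r.squared indexes into what summary() returned, while summary$r.squared indexes into the function.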

Using an anonymous function,

> l_ply( models, function( m ) summary( m )$r.squared, .print = TRUE )
[1] 0.2319583
[1] 0.01295825

will work and print the result. However, you say that you want to "extract the result", which probably means that you want to use the result and not just print it.

From the documentation of l_ply (which you'll get by typing ?l_ply at the R prompt):

For each element of a list, apply function and discard results.

(So this function will not work if you want to hang on to the result.)

Using the standard sapply / lapply will result in

> a <- sapply( models, function( t ) summary( t )$r.squared )
> a
        CA         NY 
0.23195825 0.01295825 
> typeof( a )
[1] "double"
> is.vector( a )
[1] TRUE
> # or alternatively
> l <- lapply( models, function( t ) summary( t )$r.squared )
> l
$CA
[1] 0.2319583

$NY
[1] 0.01295825
> typeof( l )
[1] "list"

Either one should work; pick whichever result (vector or list) is easier to use for what you want to do. (If unsure, just pick sapply.)
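
As a rough illustration of using the result rather than just printing it, the sketch below keeps the R-squared values in a named vector and then works with them. It is not from the original answer: the seed, the base-R split()/lapply() stand-in for dlply, the use of vapply (a stricter variant of sapply that checks each return value is a single number), and the names r2, r2_table, and best are all assumptions for the example.

```r
# Toy data as in the question; the seed is an assumption for reproducibility.
set.seed(42)
d <- data.frame(
  state = rep(c("NY", "CA"), 10),
  year = rep(1:10, 2),
  response = rnorm(20)
)

# Base-R stand-in for dlply(): a named list with one lm fit per state.
models <- lapply(split(d, d$state), function(df) lm(response ~ year, data = df))

# vapply() behaves like sapply() here but fails loudly if any element
# is not exactly one numeric value.
r2 <- vapply(models, function(m) summary(m)$r.squared, numeric(1))

# The named vector can now be used like any other value.
r2_table <- data.frame(state = names(r2), r.squared = unname(r2))
best <- names(which.max(r2))  # state whose model has the highest R-squared
print(r2_table)
print(best)
```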

(Or, if you want to use functions from the plyr package, laply, ldply, and llply seem to work too. But I've never used that package, so I can't say what's best.)

