Aggregating RMSE and R2 in R
Here is some sample data, stored as data2:
lvl  x       y
0    20.099  21.2
100  21.133  21.4
250  20.866  21.6
500  22.679  21.8
750  22.737  22.1
0    30.396  32.0
100  31.373  32.1
250  31.303  32.2
500  33.984  32.8
750  44.563  38.0
0    22.755  18.5
100  23.194  18.8
250  23.263  20.5
500  23.061  27.9
750  25.678  36.4
I tried to get the RMSE and R2 for each level (lvl) with the following lines of code:
data2 %>% group_by(lvl) %>% summarise_each(funs(rmse(data2$x~data2$y)))
and
summary(lm(data2$x,data2$y))$r.squared
respectively. I got the following error message when calculating RMSE:
Error: argument "obs" is missing, with no default
and
# A tibble: 5 x 3
    lvl         x         y
  <int>     <dbl>     <dbl>
1     0 0.6639888 0.6639888
2   100 0.6639888 0.6639888
3   250 0.6639888 0.6639888
4   500 0.6639888 0.6639888
5   750 0.6639888 0.6639888
when calculating R2.
I want to aggregate the RMSE and R2 for each level. In this case I have only 5 levels, so the answer should be 5 rows by 3 columns, with column names "lvl", "rmse", "r2". Thank you in advance.
You don't need summarise_each; summarise will do what you want. If you prefer using dplyr, here is a solution:
data2 <-
data.frame(
lvl = c( 0, 100, 250, 500, 750, 0, 100, 250, 500, 750, 0, 100, 250, 500, 750)
,x = c(
20.099, 21.133, 20.866, 22.679, 22.737, 30.396, 31.373, 31.303, 33.984, 44.563, 22.755, 23.194, 23.263, 23.061, 25.678
)
,y = c(21.2, 21.4, 21.6, 21.8, 22.1, 32.0, 32.1, 32.2, 32.8, 38.0, 18.5, 18.8, 20.5, 27.9, 36.4)
)
# install.packages(c("dplyr", "ModelMetrics"))
library(dplyr)         # provides %>%, group_by() and summarise()
library(ModelMetrics)  # provides rmse()
data2 %>%
group_by(lvl) %>%
summarise(
RMSE = rmse(x, y)
,R2 = cor(x, y)^2
)
## A tibble: 5 × 3
# lvl RMSE R2
# <dbl> <dbl> <dbl>
#1 0 2.701237 0.8176712
#2 100 2.575982 0.8645350
#3 250 1.729888 0.9091029
#4 500 2.920640 0.7207692
#5 750 7.267279 0.4542507
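The dplyr answer uses cor(x, y)^2 for R2, while the question used summary(lm(...))$r.squared. For a simple linear regression with one predictor and an intercept these two quantities coincide, so a quick check on a single level may be reassuring. This is only a sketch in base R; the names one_level, r2_from_cor and r2_from_lm are made up for illustration.

```r
# Same sample data as in the question.
data2 <- data.frame(
  lvl = rep(c(0, 100, 250, 500, 750), times = 3),
  x = c(20.099, 21.133, 20.866, 22.679, 22.737,
        30.396, 31.373, 31.303, 33.984, 44.563,
        22.755, 23.194, 23.263, 23.061, 25.678),
  y = c(21.2, 21.4, 21.6, 21.8, 22.1,
        32.0, 32.1, 32.2, 32.8, 38.0,
        18.5, 18.8, 20.5, 27.9, 36.4)
)

# Take one level and compute R-squared in both ways.
one_level   <- subset(data2, lvl == 0)
r2_from_cor <- cor(one_level$x, one_level$y)^2
r2_from_lm  <- summary(lm(x ~ y, data = one_level))$r.squared

# The two values should agree up to floating-point tolerance.
all.equal(r2_from_cor, r2_from_lm)
```

Note that this equivalence only holds for a single predictor with an intercept; with several predictors, the lm-based value is the one to use.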
## Base R alternative: split data2 into a list by the levels of lvl, then use lapply
list_of_rsquared <- lapply(split(data2, data2$lvl), function (z) {
summary(lm(x ~ y, data = z))$r.squared
}
)
## This gives a list with one r.squared value per level. Now you can simply rbind the list into a single column.
rsquared_vals <- do.call("rbind", list_of_rsquared)
You can use the same approach for RMSE. (I am assuming you have written a function called RMSE, because I am just using the formula you have above.)
list_of_rmse <- lapply(split(data2, data2$lvl), function (z) { sqrt(mean((z$x - z$y)^2)) } )
rmse_vals <- do.call("rbind", list_of_rmse)
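If no RMSE function is available, the inline formula above can be wrapped in a small helper. The following is a minimal sketch, not a package function: rmse_manual and rmse_by_lvl are hypothetical names, and the helper assumes both inputs are numeric vectors of equal length without missing values.

```r
# Same sample data as in the question.
data2 <- data.frame(
  lvl = rep(c(0, 100, 250, 500, 750), times = 3),
  x = c(20.099, 21.133, 20.866, 22.679, 22.737,
        30.396, 31.373, 31.303, 33.984, 44.563,
        22.755, 23.194, 23.263, 23.061, 25.678),
  y = c(21.2, 21.4, 21.6, 21.8, 22.1,
        32.0, 32.1, 32.2, 32.8, 38.0,
        18.5, 18.8, 20.5, 27.9, 36.4)
)

# Hypothetical helper: root mean squared difference of two vectors.
rmse_manual <- function(obs, pred) {
  sqrt(mean((obs - pred)^2))
}

# One RMSE per level; sapply returns a named numeric vector
# whose names are the levels ("0", "100", ...).
rmse_by_lvl <- sapply(split(data2, data2$lvl),
                      function(z) rmse_manual(z$x, z$y))
rmse_by_lvl
```

Using sapply instead of lapply plus do.call("rbind", ...) returns a plain named vector, which may be easier to combine with other columns later.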
Now you can just cbind all three columns you need:
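## data2$lvl has 15 values but there are only 5 levels, so use the sorted unique levels
cbind(lvl = sort(unique(data2$lvl)), rmse = as.vector(rmse_vals), r2 = as.vector(rsquared_vals))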
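The split-and-apply steps can also be assembled into the 5-row, 3-column table the question asks for in one go. This is a sketch under the assumption that base R alone is acceptable; groups and result are illustrative names, and the level values are recovered from the names of the split list rather than from data2$lvl directly.

```r
# Same sample data as in the question.
data2 <- data.frame(
  lvl = rep(c(0, 100, 250, 500, 750), times = 3),
  x = c(20.099, 21.133, 20.866, 22.679, 22.737,
        30.396, 31.373, 31.303, 33.984, 44.563,
        22.755, 23.194, 23.263, 23.061, 25.678),
  y = c(21.2, 21.4, 21.6, 21.8, 22.1,
        32.0, 32.1, 32.2, 32.8, 38.0,
        18.5, 18.8, 20.5, 27.9, 36.4)
)

# One data frame per level; list names are the levels as strings.
groups <- split(data2, data2$lvl)

# Build the summary table: one row per level.
result <- data.frame(
  lvl  = as.numeric(names(groups)),
  rmse = sapply(groups, function(z) sqrt(mean((z$x - z$y)^2))),
  r2   = sapply(groups, function(z) summary(lm(x ~ y, data = z))$r.squared),
  row.names = NULL
)
result
```

A data frame keeps the column names and types explicit, whereas cbind on numeric inputs produces a matrix.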