Regression analysis by separate groups in R
In my dataset there are two grouping variables, shop and art. Here is a data example:
reg <- read.csv("reg.csv")
structure(list(shop = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L), .Label = c("a", "c"), class = "factor"), art = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("b", "d"), class = "factor"),
Y = c(177L, 122L, 175L, 140L, 201L, 202L, 279L, 253L, 236L,
137L, 166L, 241L, 195L, 221L, 238L, 203L, 254L, 219L, 101L,
157L, 188L, 219L, 267L, 126L, 291L, 239L, 230L), x1 = c(1L,
0L, 1L, 0L, 1L, 1L, 1L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L,
0L, 0L, 0L, 1L, 1L, 1L, 0L, 0L, 1L, 0L, 1L), x2 = c(0L, 1L,
1L, 0L, 1L, 0L, 1L, 1L, 1L, 1L, 0L, 1L, 1L, 0L, 1L, 1L, 0L,
1L, 1L, 0L, 0L, 1L, 1L, 0L, 1L, 0L, 1L), x3 = c(0L, 0L, 0L,
1L, 1L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 1L, 0L, 1L, 0L, 1L, 0L,
1L, 0L, 1L, 1L, 0L, 1L, 0L, 1L, 0L), x4 = c(0L, 0L, 1L, 1L,
0L, 0L, 0L, 1L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 1L, 1L,
0L, 0L, 0L, 1L, 1L, 0L, 1L, 1L), x5 = c(0L, 0L, 1L, 1L, 0L,
0L, 1L, 1L, 1L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 1L, 1L,
1L, 0L, 0L, 1L, 1L, 1L, 0L), x6 = c(0L, 1L, 0L, 0L, 1L, 1L,
0L, 0L, 1L, 1L, 1L, 0L, 0L, 1L, 1L, 0L, 1L, 0L, 1L, 1L, 0L,
1L, 1L, 1L, 1L, 0L, 1L), x7 = c(1L, 1L, 0L, 0L, 1L, 0L, 0L,
0L, 0L, 0L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L,
0L, 1L, 1L, 1L, 0L), x8 = c(0L, 0L, 0L, 1L, 1L, 0L, 0L, 1L,
1L, 1L, 0L, 1L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 1L, 1L, 0L, 1L,
0L, 1L, 0L, 1L), x9 = c(1L, 1L, 0L, 1L, 1L, 0L, 1L, 0L, 1L,
0L, 0L, 1L, 0L, 0L, 1L, 1L, 0L, 1L, 1L, 0L, 0L, 0L, 1L, 0L,
1L, 1L, 0L)), .Names = c("shop", "art", "Y", "x1", "x2",
"x3", "x4", "x5", "x6", "x7", "x8", "x9"), class = "data.frame", row.names = c(NA,
-27L))
I need to perform a regression analysis for each group separately. The formula is simple:
mymodel <- lm(Y ~ ., data = reg)
That is, I must perform the analysis for the a+b group and the c+d group separately. In this example there are only 2 groups (a+b and c+d), where a and c are shop names and b and d are vendor codes.
How can I perform the regression analysis separately by group? In the real data there are several tens of groups, so splitting the dataset manually is not feasible.
This is a relatively common analytical pattern called split-apply-combine, and it is fairly easy to perform in R:
library(tidyverse)
library(broom)
Create a function for lm:
my_lm <- function(df) {
  lm(Y ~ ., data = df)
}
Run the models on nested groups of data:
# df is the data frame from the question
df %>%
  group_by(art, shop) %>%
  nest() %>%
  mutate(fit = map(data, my_lm),
         tidy = map(fit, tidy)) %>%
  select(-fit, -data) %>%
  unnest(tidy)
First you group by the desired variables and nest the data, then fit the lm model to each group, use tidy to extract the coefficients, remove the unwanted columns, and finally unnest. The result is:
#output
   art    shop   term        estimate std.error statistic p.value
   <fctr> <fctr> <chr>          <dbl>     <dbl>     <dbl>   <dbl>
 1 b      a      (Intercept)     31.0       269     0.115   0.927
 2 b      a      x1               109       153     0.714   0.605
 3 b      a      x2             -23.0       223    -0.103   0.934
 4 b      a      x3             -15.0       185   -0.0810   0.949
 5 b      a      x4              31.0       333    0.0931   0.941
 6 b      a      x5              81.0       457     0.177   0.888
 7 b      a      x6              77.0       162     0.475   0.718
 8 b      a      x7             -17.0       310   -0.0548   0.965
 9 b      a      x8             -15.0       214   -0.0700   0.956
10 b      a      x9              54.0       349     0.155   0.902
11 d      c      (Intercept)      199      98.8      2.01  0.0907
12 d      c      x1             -15.7      60.8    -0.259   0.804
13 d      c      x2              5.98      48.8     0.123   0.906
14 d      c      x3              7.34      57.8     0.127   0.903
15 d      c      x4             -20.1      53.8    -0.373   0.722
16 d      c      x5             -43.2      41.8     -1.03   0.342
17 d      c      x6              1.93      34.5    0.0560   0.957
18 d      c      x7              31.9      40.5     0.787   0.461
19 d      c      x8              36.0      45.9     0.786   0.462
20 d      c      x9              10.7      49.7     0.215   0.837
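One thing worth checking when fitting per group is how many rows each group has relative to the number of coefficients. In the example data the a/b group has 11 rows and the model estimates 10 coefficients (intercept plus x1 to x9), which leaves one residual degree of freedom and probably explains the very large standard errors for that group. A minimal sketch of such a check in base R follows; `reg_like` is a made-up stand-in that only mirrors the group sizes of the example, and `n_coef` is a name introduced here for illustration.

```r
# Made-up stand-in for the question's data: only the grouping columns,
# with 11 rows in a/b and 16 rows in c/d as in the example.
n <- c(11, 16)
reg_like <- data.frame(
  shop = rep(c("a", "c"), times = n),
  art  = rep(c("b", "d"), times = n)
)

# Coefficients per model: intercept + x1..x9 (assumption taken from the
# example; adjust for the real data).
n_coef <- 10

# Rows per shop/art combination, and what is left after estimation.
rows_per_group <- table(paste(reg_like$shop, reg_like$art, sep = "+"))
resid_df <- rows_per_group - n_coef
print(resid_df)
```

Groups where this number is zero or negative cannot support the full model, and groups where it is very small will give unstable estimates.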
There are many tutorials using the same or a similar approach, like the one I posted in my comment.
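If you would rather not depend on the tidyverse, the same split-apply-combine idea can be sketched in base R with split, lapply and rbind. This is only an illustration under assumptions: `toy` is made-up data with two predictors standing in for the real data frame, and the formula is written out explicitly so the constant shop and art columns stay out of the model.

```r
# Toy data standing in for the question's data frame (values are made up).
set.seed(1)
toy <- data.frame(
  shop = rep(c("a", "c"), each = 12),
  art  = rep(c("b", "d"), each = 12),
  x1   = rep(c(0, 1), times = 12),
  x2   = rep(c(0, 0, 1, 1), times = 6)
)
toy$Y <- 200 + 10 * toy$x1 - 5 * toy$x2 + rnorm(nrow(toy), sd = 20)

# Split: one data frame per observed shop/art combination.
groups <- split(toy, list(toy$shop, toy$art), drop = TRUE)

# Apply: fit the same model within each group.
fits <- lapply(groups, function(d) lm(Y ~ x1 + x2, data = d))

# Combine: stack the coefficient tables, labelled by group.
coefs <- do.call(rbind, lapply(names(fits), function(g) {
  tab <- as.data.frame(summary(fits[[g]])$coefficients)
  data.frame(group = g, term = rownames(tab), tab,
             row.names = NULL, check.names = FALSE)
}))
print(coefs)
```

The list `fits` keeps the full model objects, so residuals or predictions for a single group are still available afterwards, for example via `fits[["a.b"]]`.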