Creating a linear regression model for each group in a column

Question

I am referring to this answer: https://stackoverflow.com/a/65076441/14436230

I am trying to predict the "Education" value for 2019 from the values of past years, using lm(Education ~ poly(TIME,2)) .

However, I need to fit this model separately for each "LOCATION". I was able to create a separate lm for each LOCATION and store them in m .

Following the linked answer, my code runs up to my_predict . When I run sapply , I get this error: Error in UseMethod("predict") : no applicable method for 'predict' applied to an object of class "list"

Can someone point out my mistake? I would really appreciate any help.


linear_model <- function(TIME) lm(Education ~ poly(TIME,2), data=table2)

m <- lapply(split(table2,table2$LOCATION),linear_model)

new_df <- data.frame(TIME=c(2019))

my_predict <- function(TIME) predict(m,new_df)

sapply(m,my_predict)   #error here

Answer 1

Are you looking for something like this?

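library(tidyverse)
library(broom)

# df is your data frame (table2 in the question)
df %>% 
  mutate(LOCATION = as_factor(LOCATION)) %>% 
  group_by(LOCATION) %>% 
  group_split() %>% 
  # fit one linear model per LOCATION and return one row of fit statistics for each
  map_dfr(.f = function(df){
    lm(Education ~ TIME, data = df) %>% 
      glance() %>% 
      add_column(LOCATION = unique(df$LOCATION), .before=1)
  })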

  LOCATION r.squared adj.r.squared sigma statistic p.value    df logLik   AIC   BIC deviance df.residual  nobs
  <fct>        <dbl>         <dbl> <dbl>     <dbl>   <dbl> <dbl>  <dbl> <dbl> <dbl>    <dbl>       <int> <int>
1 AUT         0.367         0.261   4.88     3.47    0.112     1  -22.9  51.8  52.0    143.            6     8
2 BEL         0.0225       -0.173   3.90     0.115   0.748     1  -18.3  42.6  42.4     76.0           5     7
3 CZE         0.0843       -0.0683  3.22     0.552   0.485     1  -19.6  45.1  45.3     62.2           6     8
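If what you need is the 2019 prediction rather than the fit statistics, the same split-and-map idea can return predictions instead. This is only a minimal sketch, assuming purrr is installed; the toy data frame and its values are made up for illustration and stand in for your real data:

```r
library(purrr)

# Hypothetical toy data; the values are made up for illustration.
df <- data.frame(
  LOCATION  = rep(c("AUT", "BEL"), each = 5),
  TIME      = rep(2014:2018, times = 2),
  Education = c(10, 11, 13, 12, 14, 20, 19, 22, 23, 21)
)

new_df <- data.frame(TIME = 2019)

# One linear model per LOCATION, stored in a named list.
fits <- map(split(df, df$LOCATION), ~ lm(Education ~ TIME, data = .x))

# One predicted value per model, returned as a named numeric vector.
pred_2019 <- map_dbl(fits, ~ unname(predict(.x, newdata = new_df)))
pred_2019
```

The result is a numeric vector with one element per LOCATION, named after the groups.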

Answer 2

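The problem is in how your functions are defined. A function is written as function(x), and inside its body you use x to refer to whatever data is passed in. Your functions take an argument (TIME) but never use it: linear_model always fits on the whole of table2, and my_predict always calls predict on the whole list m, which is what triggers the error.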

For example, in the linear_model function you defined, if you were to use it alone you would write:

linear_model(data)

Inside lapply this is a bit harder to see. lapply simply loops over the data frames returned by split(table2,table2$LOCATION) and calls linear_model on each one.
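To make the loop explicit, here is a small sketch comparing lapply with an equivalent for loop. The toy table2 is hypothetical (made-up values), and linear_model is written so that it uses its argument:

```r
# Hypothetical toy data standing in for table2; the values are made up.
table2 <- data.frame(
  LOCATION  = rep(c("AUT", "BEL"), each = 5),
  TIME      = rep(2014:2018, times = 2),
  Education = c(10, 11, 13, 12, 14, 20, 19, 22, 23, 21)
)

# One data frame per LOCATION.
pieces <- split(table2, table2$LOCATION)

# The function fits the model on whatever data frame it receives.
linear_model <- function(x) lm(Education ~ TIME, data = x)

# lapply version: call linear_model once per piece.
m_lapply <- lapply(pieces, linear_model)

# Equivalent explicit loop.
m_loop <- list()
for (loc in names(pieces)) {
  m_loop[[loc]] <- linear_model(pieces[[loc]])
}

# Both lists hold one fitted model per LOCATION with matching coefficients.
all.equal(coef(m_lapply$AUT), coef(m_loop$AUT))
```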

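The same thing happens with my_predict : sapply passes each fitted model in m to it, so it should call predict on its argument rather than on m itself.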
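To see why the original version fails, here is a small reproduction. The toy table2 is hypothetical (made-up values); linear_model is copied from the question, and the predict call is the one inside the question's my_predict:

```r
# Hypothetical toy data standing in for table2; the values are made up.
table2 <- data.frame(
  LOCATION  = rep(c("AUT", "BEL"), each = 5),
  TIME      = rep(2014:2018, times = 2),
  Education = c(10, 11, 13, 12, 14, 20, 19, 22, 23, 21)
)

# As in the question: the argument TIME is never used, data is always table2.
linear_model <- function(TIME) lm(Education ~ poly(TIME, 2), data = table2)
m <- lapply(split(table2, table2$LOCATION), linear_model)

# Every element was fitted on all of table2, so the coefficients match.
same_fit <- identical(coef(m$AUT), coef(m$BEL))

# m is a plain list of models, not a model.
m_class <- class(m)

# Calling predict on the list itself (what my_predict does) raises the error.
new_df <- data.frame(TIME = 2019)
err <- tryCatch(predict(m, new_df), error = function(e) conditionMessage(e))
err
```

So there are two separate issues: the models are not fitted per LOCATION, and predict is given the list instead of a single model.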

Anyway, this should work for you:

linear_model <- function(x) lm(Education ~ TIME, x)

m <- lapply(split(table2,table2$LOCATION),linear_model)

new_df <- data.frame(TIME=c(2019))

my_predict <- function(x) predict(x,new_df)

sapply(m,my_predict)
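Note that the code above fits a straight line in TIME. If you want to keep the quadratic term from the question, the same pattern should work with poly(TIME, 2). A sketch, with made-up toy data standing in for table2; vapply is used instead of sapply only to guarantee a numeric vector, and each group needs more than two distinct TIME values for a degree-2 polynomial:

```r
# Hypothetical toy data standing in for table2; the values are made up.
table2 <- data.frame(
  LOCATION  = rep(c("AUT", "BEL"), each = 5),
  TIME      = rep(2014:2018, times = 2),
  Education = c(10, 11, 13, 12, 14, 20, 19, 22, 23, 21)
)

# Quadratic model fitted on the data frame passed in as x.
linear_model <- function(x) lm(Education ~ poly(TIME, 2), data = x)

# One model per LOCATION.
m <- lapply(split(table2, table2$LOCATION), linear_model)

new_df <- data.frame(TIME = 2019)

# Predict from the single model passed in, not from the whole list m.
my_predict <- function(fit) unname(predict(fit, newdata = new_df))

# Named numeric vector: one 2019 prediction per LOCATION.
pred_2019 <- vapply(m, my_predict, numeric(1))
pred_2019
```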


solution1
0 2021-10-29 10:20:04

solution2
0 ACCEPTED 2021-10-29 10:56:17
