Linear regression model and interaction terms problem

Question

I'm totally new to R and to statistics as well.

Using the crabs.csv dataset I made a linear regression model using this code:

facdF = dF %>% mutate(sex = factor(sex, labels = c("F", "M")))
fac2dF = dF %>% mutate(sp = factor(sp, labels = c("O", "B")))

dfS <- summary(with(fac2dF, lm(as.numeric(gsub(",",".",CL)) ~ sex*sp)))

Here dF is the original data frame. When I run the code, I get this output for the summary of the model:

Call:
lm(formula = as.numeric(gsub(",", ".", CL)) ~ sex * sp)

Residuals:
    Min      1Q  Median      3Q     Max 
-16.430  -4.423  -0.065   5.378  14.570 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   27.965      1.100  25.421  < 2e-16 ***
sexM           4.565      1.587   2.876  0.00462 ** 
spO            6.507      1.598   4.071 7.61e-05 ***
sexM:spO      -5.963      2.266  -2.631  0.00942 ** 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 6.957 on 147 degrees of freedom
Multiple R-squared:  0.115, Adjusted R-squared:  0.09692 
F-statistic: 6.366 on 3 and 147 DF,  p-value: 0.0004364

I would basically like to ask you two questions:

Why is the estimate for sexM:spO negative?
I must extract a prediction for sexF:spO, but that interaction term is not present in the output. How can I do it?

EDIT: As requested, the file crabs.csv is available on GitHub.

Answer 1

I've resolved it by just selecting factors with:

facdF = dF %>% mutate(sex = factor(sex, labels = c("F")))
fac2dF = dF %>% mutate(sp = factor(sp, labels = c("O")))

instead of:

facdF = dF %>% mutate(sex = factor(sex, labels = c("M", "F")))
fac2dF = dF %>% mutate(sp = factor(sp, labels = c("B", "O")))

Answer 2

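The estimate for sexM:spO is negative because that is the direction of the effect in the data. The interaction term is a difference of differences: the male-minus-female difference in CL is about 5 units smaller in species O than in species B (equivalently, the O-minus-B difference is about 5 units smaller for males than for females). It does not mean that males of species O have a CL 5 units lower than males of species B. The same model fitted to the crabs data that ships with MASS gives slightly different numbers from yours, but the same pattern: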

library("MASS")
data(crabs)

fit <- lm(CL ~ sex * sp, data = crabs)
summary(fit)
#> 
#> Call:
#> lm(formula = CL ~ sex * sp, data = crabs)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -16.988  -4.636   0.184   5.130  15.086 
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)  28.1020     0.9499  29.584  < 2e-16 ***
#> sexM          3.9120     1.3434   2.912  0.00401 ** 
#> spO           6.5160     1.3434   4.851  2.5e-06 ***
#> sexM:spO     -4.8420     1.8998  -2.549  0.01158 *  
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 6.717 on 196 degrees of freedom
#> Multiple R-squared:  0.1232, Adjusted R-squared:  0.1098 
#> F-statistic: 9.181 on 3 and 196 DF,  p-value: 1.031e-05
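
To make the role of each coefficient concrete, here is a small sketch (using the MASS copy of crabs, not the asker's crabs.csv) that rebuilds the four fitted group means from the coefficients. The names cell_means, F_B, M_B, F_O and M_O are just labels chosen for this illustration. With the estimates printed above, a female of species O comes out as 28.102 + 6.516 = 34.618, with no interaction term involved, because both F and B are reference levels.

```r
library(MASS)
data(crabs)

fit <- lm(CL ~ sex * sp, data = crabs)
b <- coef(fit)

# Fitted mean CL for each sex/species combination, built from the coefficients.
# F and B are the reference levels, so they contribute nothing beyond the intercept.
cell_means <- c(
  F_B = unname(b["(Intercept)"]),
  M_B = unname(b["(Intercept)"] + b["sexM"]),
  F_O = unname(b["(Intercept)"] + b["spO"]),
  M_O = unname(b["(Intercept)"] + b["sexM"] + b["spO"] + b["sexM:spO"])
)
round(cell_means, 3)

# Cross-check against the raw group means (rows = sex, columns = sp)
group_means <- with(crabs, tapply(CL, list(sex, sp), mean))
group_means
```

Because the model contains both main effects and their interaction, it has one parameter per group, so these reconstructed values should agree with the plain group means.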

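sexM shows up because categorical variables are treated as factors in R models, and factor levels are sorted alphabetically by default, so F is the reference level for sex and B is the reference level for sp. The reference levels are absorbed into the intercept, which is why there is no sexF or sexF:spO coefficient. To get an interaction term named sexF:spO, you would have to ensure that the variable being passed in has M as the first level and F as the second, something like: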

crabs$sex <- factor(crabs$sex, levels = c("M", "F"))
fit <- lm(CL ~ sex * sp, data = crabs)
summary(fit)
#> 
#> Call:
#> lm(formula = CL ~ sex * sp, data = crabs)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -16.988  -4.636   0.184   5.130  15.086 
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)  32.0140     0.9499  33.702  < 2e-16 ***
#> sexF         -3.9120     1.3434  -2.912  0.00401 ** 
#> spO           1.6740     1.3434   1.246  0.21420    
#> sexF:spO      4.8420     1.8998   2.549  0.01158 *  
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 6.717 on 196 degrees of freedom
#> Multiple R-squared:  0.1232, Adjusted R-squared:  0.1098 
#> F-statistic: 9.181 on 3 and 196 DF,  p-value: 1.031e-05

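This flips the sign of the interaction term. The model itself is unchanged: the residuals, the residual standard error and the R-squared are identical in both fits, only the parameterisation differs.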
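
If the goal is only a prediction for a female of species O, releveling is not strictly needed. A minimal sketch, again assuming the MASS copy of crabs rather than the asker's file: predict() with a one-row newdata data frame returns the fitted value for that combination regardless of which levels are the reference. The names new_crab and pred are made up for this example.

```r
library(MASS)
data(crabs)  # reload the original data, with the default level order

fit <- lm(CL ~ sex * sp, data = crabs)

# One row describing the combination we want a prediction for;
# the factor levels are taken from the data used to fit the model
new_crab <- data.frame(
  sex = factor("F", levels = levels(crabs$sex)),
  sp  = factor("O", levels = levels(crabs$sp))
)

# Point prediction plus a 95% confidence interval for the mean CL
pred <- predict(fit, newdata = new_crab, interval = "confidence")
pred
```

The fit column should equal the intercept plus the spO coefficient of the first model (28.102 + 6.516 = 34.618). For this to work on your own data, CL needs to be a numeric column in the data frame passed to lm() through its data argument, rather than being converted inside the formula.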

2 answers

solution1
0 2022-04-08 20:45:28

solution2
0 ACCEPTED 2022-04-08 23:18:33
