
Looping lm models over columns in a list of dataframes and outputting dataframes showing the slopes and p-values

I want to fit lm() models for each response variable i against a single explanatory variable, across a list of dataframes that were split by a factor. I then want to create two dataframes holding the lm coefficients: the first with the slope and the second with the p.value, with the response variables tested in the models as columns and the factor levels as rows.

I managed to run the lm models and print their summary output, but I am not sure how to create the appropriate slope and p.value dataframes.

Here is what I've done:

data (iris)
iris_split = split (iris,f=iris$Species) ### Split the data by factor "Species"

I want to run lm models for each of the following variables (treated as responses for the sake of the question) with Petal.Width as the explanatory variable:

vars = as.vector (unique (colnames (subset (iris, select = -c(Species, Petal.Width )))))
#Output:
#> vars
#[1] "Sepal.Length" "Sepal.Width"  "Petal.Length"
iris_lm = for (i in vars) { # loop across vars
  lm_summary = lapply (iris_split, FUN = function(x)
    summary(lm (x[,i] ~ x[,"Petal.Width"]))) # where x is the dataframe for one level of "Species"
  print(i) # so I could see which variable is tested in the model
  print(lm_summary)
}
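
For reference, the slope and p-value of each fit sit in the coefficient matrix of its summary object. A minimal sketch for a single model (setosa, Sepal.Length), where setosa, fit, coefs, slope and p_value are illustrative names that do not appear in the code above:

```r
# Fit one model: Sepal.Length as response, Petal.Width as predictor, setosa only
setosa <- subset(iris, Species == "setosa")
fit <- lm(Sepal.Length ~ Petal.Width, data = setosa)

# coef(summary(fit)) is a matrix: one row per term, one column per statistic
coefs <- coef(summary(fit))
rownames(coefs)   # "(Intercept)" "Petal.Width"
colnames(coefs)   # "Estimate" "Std. Error" "t value" "Pr(>|t|)"

# Index by row and column name rather than by position
slope   <- coefs["Petal.Width", "Estimate"]   # slope for Petal.Width
p_value <- coefs["Petal.Width", "Pr(>|t|)"]   # p-value of that slope
```

Each element of lm_summary in the loop is a summary object of this kind, so the same indexing applies to it.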

How do I create slop.df and p.val.df? They need to look like this:

#> slop.df
#     Species Sepal.Length Sepal.Width Petal.Length
#1     setosa       slope?      slope?       slope?
#2 versicolor       slope?      slope?       slope?
#3  virginica       slope?      slope?       slope?

The actual slopes need to be shown instead of the "slope?" placeholder, and the same goes for the p-values in p.val.df.

Packages from the tidyverse make this fairly convenient:

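library(dplyr)
library(tidyr)

coefs <- iris %>%
    pivot_longer(-c(Species, Petal.Width),
                 names_to = 'variable',
                 values_to = 'value') %>%
    group_by(Species, variable) %>%
    ## mind to return the model results as a list!
    ## response ~ Petal.Width, matching the models in the question
    summarise(model_summary = list(summary(lm(value ~ Petal.Width))),
              .groups = 'drop') %>%
    rowwise() %>%
    ## row 2 of the coefficient matrix is the Petal.Width term
    mutate(slope = model_summary$coefficients[2, 'Estimate'],
           p = model_summary$coefficients[2, 'Pr(>|t|)']) %>%
    ungroup()

## one wide table per statistic: species in rows, response variables in columns
slop.df <- pivot_wider(coefs, id_cols = Species,
                       names_from = 'variable', values_from = 'slope')
p.val.df <- pivot_wider(coefs, id_cols = Species,
                        names_from = 'variable', values_from = 'p')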
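
The same two tables can also be built in base R, staying close to the split() approach from the question. This is a minimal sketch rather than a definitive implementation: extract_stat and to_df are hypothetical helper names introduced here, and each model is fit as response ~ Petal.Width, as the question describes.

```r
# Split by species and list the response variables, as in the question
iris_split <- split(iris, iris$Species)
vars <- setdiff(colnames(iris), c("Species", "Petal.Width"))

# Hypothetical helper: species-by-variable matrix of one statistic, taken from
# the Petal.Width row of each model's coefficient matrix
extract_stat <- function(stat) {
  sapply(vars, function(v) {
    sapply(iris_split, function(d) {
      fit <- lm(reformulate("Petal.Width", response = v), data = d)
      coef(summary(fit))["Petal.Width", stat]
    })
  })
}

# Hypothetical helper: turn the matrix into a dataframe with a Species column
to_df <- function(m) {
  data.frame(Species = rownames(m), m, row.names = NULL, check.names = FALSE)
}

slop.df  <- to_df(extract_stat("Estimate"))
p.val.df <- to_df(extract_stat("Pr(>|t|)"))
```

Building the formula with reformulate() and passing data = d keeps the variable names in the fitted model, so the slope can be picked out by the row name "Petal.Width" instead of by position.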


 