I have one column with a response variable and many columns with independent variables. Every independent variable is binary (0 or 1), and I want to loop through each column to calculate the mean of the response variable for the 1's and for the 0's so that I can run a t-test. I am new to R and don't know how to set the response variable column aside or how to assign all of the other columns to a variable.
You can use lapply to solve your problem:
response <- "the_response"                    # name of the response column
preds <- setdiff(names(your_data), response)  # names of all other columns (the predictors)
lapply(preds, function(x) t.test(reformulate(x, response), data = your_data))
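In case it helps to see what reformulate does on its own, here is a minimal sketch. The column names "x1" and "y" are just placeholders taken from the example; reformulate turns the two strings into the formula that t.test expects.

```r
# reformulate() builds a formula from character strings:
# first argument = right-hand side term(s), response = left-hand side
f <- reformulate("x1", response = "y")

class(f)    # a formula object
deparse(f)  # the formula written out as text: "y ~ x1"
```

So inside lapply, each predictor name x becomes a formula of the form response ~ x, which t.test then evaluates against the columns of the data frame passed as data.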
Example:
dat
y x1 x2 x3 x4 x5 x6 x7 x8 x9 x10
1 5.1 1 1 1 1 1 1 1 1 1 1
2 4.9 1 1 1 1 0 1 1 1 1 1
3 4.7 0 0 0 0 1 0 0 0 0 0
4 4.6 0 1 1 0 0 0 1 1 0 0
5 5.0 0 1 1 1 1 1 0 0 1 1
6 5.4 1 1 1 1 1 1 1 1 1 1
7 4.6 1 0 1 1 0 0 1 0 1 0
8 5.0 0 0 0 1 0 0 0 0 0 0
9 4.4 0 0 0 0 0 0 0 0 1 0
10 4.9 0 1 0 0 0 0 1 1 0 0
11 5.4 1 1 1 1 1 1 1 1 0 1
12 4.8 1 1 1 1 1 1 1 1 1 1
13 4.8 0 1 0 0 0 0 0 0 1 0
14 4.3 1 1 1 1 1 1 1 1 1 1
15 5.8 0 0 1 1 1 0 1 1 1 0
preds <- names(dat)[-1]
response <- "y"
lapply(preds, function(x) t.test(reformulate(x, response), data = dat, var.equal = TRUE))
[[1]]
Two Sample t-test
data: y by x1
t = -0.13376, df = 13, p-value = 0.8956
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-0.4900214 0.4328786
sample estimates:
mean in group 0 mean in group 1
4.900000 4.928571
[[2]]
Two Sample t-test
data: y by x2
t = -0.088442, df = 13, p-value = 0.9309
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-0.5085418 0.4685418
sample estimates:
mean in group 0 mean in group 1
4.90 4.92
:
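If the printed list gets long, the group means and p-values can be collected into one table. This is only a sketch under some assumptions: the data frame toy and the names tests and summary_tbl are made up for illustration (they are not from the answer above), and it assumes every predictor contains both 0's and 1's.

```r
# Small made-up data set, for illustration only
toy <- data.frame(
  y  = c(5.1, 4.9, 4.7, 4.6, 5.0, 5.4, 4.6, 5.0),
  x1 = c(1, 1, 0, 0, 0, 1, 1, 0),
  x2 = c(1, 1, 0, 1, 1, 1, 0, 0)
)

response <- "y"
preds <- setdiff(names(toy), response)  # every column except the response

# One t-test per predictor, stored in a list named by predictor
tests <- lapply(preds, function(x) {
  t.test(reformulate(x, response), data = toy, var.equal = TRUE)
})
names(tests) <- preds

# Pull the two group means, the t statistic and the p-value out of each result
summary_tbl <- do.call(rbind, lapply(preds, function(x) {
  tt <- tests[[x]]
  data.frame(
    predictor = x,
    mean_0    = unname(tt$estimate[1]),  # "mean in group 0"
    mean_1    = unname(tt$estimate[2]),  # "mean in group 1"
    t         = unname(tt$statistic),
    p_value   = tt$p.value
  )
}))

summary_tbl
```

Naming the list also means a single result can be looked up directly, for example tests[["x2"]].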