
Using a for loop to use manova in R?

anova_test <- function(dataSet, dataOne, dataTwo){
  for (j in 1:8){
    for (i in 1:4){
      for (k in i:4){
        if(i!=k){
        res <- manova(cbind(colnames(dataOne)[i], colnames(dataOne)[k]) ~ colnames(dataTwo)[j], data = dataSet)
        summary(res.man)
        # Look to see which differ
        summary.aov(res.man)
        }
      }
    }
  }
}

D <- apply_impute(data)
dataOne <- select(D, age, child, balance, previous)
dataTwo <- select(D, job, marital, education, default, housing, loan,
                            contact, month)
anova_test(D, dataOne, dataTwo)

Here is my code. D is a data set. dataOne holds the quantitative variables of D and dataTwo holds the categorical variables of D. I want to run manova on every pair of quantitative variables against every categorical variable.

But when I run it, I get the following error :

Error in `[[<-.data.frame`(`*tmp*`, i, value = 1:2) : 
  replacement has 2 rows, data has 1
In addition: Warning message:
In storage.mode(v) <- "double" :


Could you please help me to find what's wrong in my code?

Consider building every combination of the two sets of column names with expand.grid, then looping over those combinations once, element by element, with Map (a wrapper around mapply). This replaces the three nested for loops, which never save their results to any object.

# BUILD DATA FRAME OF ALL POSSIBLE COMBINATIONS
# (quant1, quant2: quantitative response columns; categ: categorical predictor)
params_df <- expand.grid(quant1 = c("age", "child", "balance", "previous"),
                         quant2 = c("age", "child", "balance", "previous"),
                         categ = c("job", "marital", "education", "default", 
                                   "housing", "loan", "contact", "month"))

# REMOVE ROWS WHERE BOTH RESPONSE VARIABLES ARE THE SAME
params_df <- subset(params_df, quant1 != quant2)


# USER-DEFINED FUNCTION TO CALL manova WITH DYNAMIC FORMULA AND RETURN RESULTS
anova_test <- function(dataSet, quant1, quant2, categ) {

   # e.g. cbind(age,child) ~ job
   frml <- as.formula(paste0("cbind(", quant1, ",", quant2, ") ~ ", categ))
   res.man <- manova(frml, data = dataSet) 

   res.list <- list(estimates = summary(res.man),
                    aov = summary.aov(res.man))

   return(res.list)
}

# RETRIEVE DATA
D <- apply_impute(data)

# BUILD LIST OF MANOVA RESULTS
manova_list <- Map(anova_test, 
                   quant1 = params_df$quant1,
                   quant2 = params_df$quant2, 
                   categ = params_df$categ,
                   MoreArgs = list(dataSet = D))

Output

# DISPLAY SELECT RESULTS BY INDEX AND NAMES
manova_list[[1]]$estimates       
manova_list[[1]]$aov

manova_list[[2]]$estimates
manova_list[[2]]$aov
# ...


# DISPLAY ALL RESULTS
lapply(manova_list, `[[`, "estimates")
lapply(manova_list, `[[`, "aov")
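
Since D and apply_impute are not available here, the same expand.grid plus Map pattern can be tried on simulated data. This is only a sketch under stated assumptions: the data frame toy, the helper fit_one, and the column subset are made up for illustration, and the filter y1 < y2 (rather than !=) is a swapped-in choice that keeps each pair of responses once, as the loops in the question intended.

```r
# Simulated stand-in for D (hypothetical data, for illustration only)
set.seed(42)
n <- 120
toy <- data.frame(
  age     = rnorm(n, mean = 40, sd = 10),
  balance = rnorm(n, mean = 1500, sd = 400),
  child   = rpois(n, lambda = 2),
  marital = factor(sample(c("single", "married", "divorced"), n, replace = TRUE)),
  housing = factor(sample(c("yes", "no"), n, replace = TRUE))
)

# One row per (response pair, predictor) combination, kept as plain strings
pairs_df <- expand.grid(y1 = c("age", "balance", "child"),
                        y2 = c("age", "balance", "child"),
                        x  = c("marital", "housing"),
                        stringsAsFactors = FALSE)

# Keep each unordered pair once (alphabetical order of the two names)
pairs_df <- subset(pairs_df, y1 < y2)

# Fit one MANOVA from column names given as strings (hypothetical helper)
fit_one <- function(y1, y2, x, dataSet) {
  frml <- as.formula(paste0("cbind(", y1, ", ", y2, ") ~ ", x))
  res.man <- manova(frml, data = dataSet)
  list(estimates = summary(res.man), aov = summary.aov(res.man))
}

toy_results <- Map(fit_one, pairs_df$y1, pairs_df$y2, pairs_df$x,
                   MoreArgs = list(dataSet = toy))

# Label each result with the model it came from
names(toy_results) <- paste0(pairs_df$y1, " + ", pairs_df$y2, " ~ ", pairs_df$x)

length(toy_results)          # 3 response pairs x 2 predictors
toy_results[[1]]$estimates   # multivariate test for the first model
```

Naming the list elements after their formulas makes it easier to look up one model later, for example toy_results[["age + balance ~ marital"]]$aov.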

First, you don't need to pass the whole data set into your anova_test function, because you are already passing it in as two blocks.

Then, in your modelling line, you need to supply the actual data rather than just the column names: colnames(dataOne)[i] is only a character string, so the formula never sees a data column. Once you supply the data directly, you no longer need the data argument either.

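For example (as.matrix turns the two selected columns into the numeric response matrix that manova expects, and [[j]] extracts the predictor as a plain vector):

res.man <- manova(as.matrix(dataOne[, c(i, k)]) ~ dataTwo[[j]])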

You could also do this using the column names and the full data set, but it's needlessly more difficult. The difference from your code is the use of get, which turns a name given as a string into the object it refers to.

res.man <- manova(cbind(get(colnames(dataOne)[i]), get(colnames(dataOne)[k])) ~ get(colnames(dataTwo)[j]), data = dataSet)
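
As a rough illustration of the index-based variant above, here is a minimal sketch on simulated data. The names toyNum, toyCat, and results are hypothetical stand-ins (toyNum plays the role of dataOne, toyCat the role of dataTwo), and storing each summary in a list is an added assumption, since summary() called inside a for loop neither prints nor saves anything.

```r
set.seed(1)
n <- 90

# Hypothetical stand-ins for dataOne (quantitative) and dataTwo (categorical)
toyNum <- data.frame(age      = rnorm(n, mean = 40, sd = 10),
                     balance  = rnorm(n, mean = 1500, sd = 400),
                     previous = rpois(n, lambda = 1))
toyCat <- data.frame(
  loan    = factor(sample(c("yes", "no"), n, replace = TRUE)),
  contact = factor(sample(c("cellular", "telephone", "unknown"), n, replace = TRUE))
)

# A column name is only a string, which is why the original formula had no data
is.character(colnames(toyNum)[1])

results <- list()
for (j in seq_along(toyCat)) {
  for (i in seq_len(ncol(toyNum) - 1)) {
    for (k in (i + 1):ncol(toyNum)) {      # k > i, so each pair is used once
      y <- as.matrix(toyNum[, c(i, k)])    # numeric response matrix
      g <- toyCat[[j]]                     # grouping factor as a plain vector
      res.man <- manova(y ~ g)
      key <- paste(colnames(toyNum)[i], colnames(toyNum)[k],
                   colnames(toyCat)[j], sep = "_")
      results[[key]] <- summary(res.man)   # keep the result instead of discarding it
    }
  }
}

names(results)   # one entry per (pair, predictor) combination
```

Starting k at i + 1 removes the need for the if (i != k) check used in the question.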

Finally, I'm not sure why you want so many pairwise MANOVAs like this; there may be a statistically better way to do what you want.
