简体   繁体   中英

Using lapply with multiple function inputs without nesting

I have a data frame with four columns: two columns indicate participation in a sport, while the other two columns indicate whether the player passed each of their two fitness exams.

dat <- data.frame(SOCCER = sample(0:1, 10, replace = T),
                  BASEBALL = sample(0:1, 10, replace = T),
                  TEST_1_PASS = sample(0:1, 10, replace = T),
                  TEST_2_PASS = sample(0:1, 10, replace = T))

I would like to obtain a list containing contingency tables for each sport and exam. I know that I can accomplish this using the following code, which uses nested lapply statements, but this strikes me as inefficient. Can anyone propose a more elegant solution that doesn't use nesting?

results <- lapply(c("SOCCER", "BASEBALL"), function(x) {
  lapply(c("TEST_1_PASS", "TEST_2_PASS"), function(y){
    table(sport = dat[[x]], pass = dat[[y]])
  })
})

Thanks as always!

The double lapply gets all pairwise combinations of the columns in each of the columns' vectors, like @Gregor wrote in a comment

I don't think your nesting is inefficient... you need to call table 4 times. It doesn't really matter if that's inside one loop/lapply over 4 items or 2 nested loops/lapplys with 2 items each.

But here is another way, with one of the loops in disguise as expand.grid .

cols <- expand.grid(x = c("SOCCER", "BASEBALL"), 
                    y = c("TEST_1_PASS", "TEST_2_PASS"),
                    stringsAsFactors = FALSE)
Map(function(.x, .y)table(dat[[.x]], dat[[.y]]), cols$x, cols$y)

Consider re-formatting your data into long format (ie, tidy data ), merging the two data pieces of sport and exam , then run by (a rarely used member of apply family as object-oriented wrapper to tapply ) for all subset combinations between the two returning a named header report of results:

# RESHAPE EACH DATA SECTION (SPORT AND EXAM) INTO LONG FORMAT
df_list <- lapply(list(c("SOCCER", "BASEBALL"), c("TEST_1_PASS", "TEST_2_PASS")), function(cols)
                   reshape(cbind(PLAYER = row.names(dat), dat[cols]), 
                           varying = cols, v.names = "VALUE", 
                           times = cols, timevar = "INDICATOR", 
                           idvar = "PLAYER", ids = NULL,
                           new.row.names = 1:1E4, direction = "long")
           )

# CROSS JOIN (ALL COMBINATION PAIRINGS)
final_df <- Reduce(function(x,y) merge(x, y, by="PLAYER", suffixes=c("_SPORT", "_EXAM")), df_list)
final_df

# RUN TABLES FOR EACH SUBSET COMBINATION
tables_list <- with(final_df, by(final_df, list(INDICATOR_SPORT, INDICATOR_EXAM), function(sub)
                                  table(sport = sub$VALUE_SPORT, pass = sub$VALUE_EXAM)
                                )
                   )    

Output

tables_list

# : BASEBALL
# : TEST_1_PASS
#      pass
# sport 0 1
#     0 3 4
#     1 2 1
# ------------------------------------------------------------ 
# : SOCCER
# : TEST_1_PASS
#      pass
# sport 0 1
#     0 2 0
#     1 3 5
# ------------------------------------------------------------ 
# : BASEBALL
# : TEST_2_PASS
#      pass
# sport 0 1
#     0 3 4
#     1 1 2
# ------------------------------------------------------------ 
# : SOCCER
# : TEST_2_PASS
#      pass
# sport 0 1
#     0 2 0
#     1 2 6

Online Demo

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM