
Mapping pipes to multiple columns in tidyverse

I'm working with a table in which I need to count the rows that satisfy some criterion, column by column. I ended up repeating essentially the same pipe many times, with only the variable name changing.

Say I want to know, for each variable in mtcars, how many cars have a larger value than the Valiant (row 6). Example code for two variables is below:

library(tidyverse)

reference <- mtcars %>% 
  slice(6)

mpg <- mtcars  %>% 
  filter(mpg > reference$mpg) %>%
  count() %>% 
  pull()

cyl <- mtcars  %>% 
  filter(cyl > reference$cyl) %>%
  count() %>% 
  pull()

tibble(mpg, cyl)

However, suppose I need to do this for around 100 variables. There must be a better way than copying the same block for each one.

How can I rewrite the code above more concisely (maybe using map() or anything else that works nicely with pipes), so that the result is a tibble with the counts for all the variables in mtcars?

I feel the solution should be very easy but I'm stuck. Thank you!

You could use summarise() with across() to count, in each column, the observations greater than that column's value in row 6.

library(dplyr)

mtcars %>%
  summarise(across(everything(), ~ sum(. > .[6])))

#   mpg cyl disp hp drat wt qsec vs am gear carb
# 1  18  14   15 22   30 11    1  0 13   17   25
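If the real table also has non-numeric columns, everything() would try to compare them too. Here is a minimal sketch of one way to handle that, assuming only numeric columns should be counted. The helper name count_above_row and the extra model column are made up for illustration and are not part of the original answer:

```r
library(dplyr)

# Hypothetical helper (not from the original answer): for every numeric
# column of `df`, count the rows whose value exceeds the value in `ref_row`.
count_above_row <- function(df, ref_row) {
  df %>%
    summarise(across(where(is.numeric), ~ sum(.x > .x[ref_row], na.rm = TRUE)))
}

# Add a character column to show that it is skipped rather than compared.
cars <- mtcars %>% mutate(model = rownames(mtcars))

result <- count_above_row(cars, 6)
result
```

The na.rm = TRUE is also an assumption: it ignores missing values instead of letting them turn the whole count into NA.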

A base R solution:
# (1)
colSums(mtcars > mtcars[rep(6, nrow(mtcars)), ])

# (2)
colSums(sweep(as.matrix(mtcars), 2, mtcars[6, ], ">"))

# mpg  cyl disp   hp drat   wt qsec   vs   am gear carb
#  18   14   15   22   30   11    1    0   13   17   25

Or, with purrr:

library(tidyverse)

map_dfc(mtcars, ~sum(.x[6] < .x))

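# Alternatively, with the one-row reference data frame from the question
reference <- slice(mtcars, 6)
map2_dfc(mtcars, reference, ~sum(.y < .x))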
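As far as I know, map_dfc() is marked as superseded in purrr 1.0.0 and later, so here is a sketch of the same idea with map_int(), which returns a named integer vector that can then be reshaped as needed. The object names counts, counts_wide and counts_long are just illustrative:

```r
library(purrr)
library(tibble)

# One integer per column: the number of rows greater than row 6 (the Valiant).
counts <- map_int(mtcars, ~ sum(.x > .x[6]))

# Wide format: a one-row tibble with one column per variable.
counts_wide <- as_tibble(as.list(counts))

# Long format: one row per variable, which may be easier to read
# when there are around 100 variables.
counts_long <- enframe(counts, name = "variable", value = "n")
```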

You can also do it in a loop, for example like this:

library(tidyverse)

reference <- mtcars %>% 
  slice(6)

# Empty list to save outcome
list_outcome <- list()

# Get the column names to loop over
loop_var <- colnames(reference)
for(i in loop_var){
  nr <- mtcars  %>% 
    filter(mtcars[, i] > reference[, i]) %>%
    count() %>% 
    pull()
  # Save each iteration's result in the list under the variable's name
  list_outcome[[i]] <- data.frame(Variable = i, Value = nr)
}

# combine all the data frames in the list to one final data frame
df_result <- do.call(rbind, list_outcome)
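If the long Variable/Value layout is what you want, the explicit loop and the do.call(rbind, ...) step can probably be avoided. A rough sketch using only base R's vapply(), assuming the same row-6 reference as above:

```r
# Row 6 (the Valiant) as a one-row reference data frame
reference <- mtcars[6, ]

# For each column name, count the rows whose value exceeds the reference value.
# integer(1) tells vapply() to expect exactly one integer per column.
counts <- vapply(
  names(mtcars),
  function(v) sum(mtcars[[v]] > reference[[v]]),
  integer(1)
)

# Same two-column layout as the loop version
df_result <- data.frame(Variable = names(counts), Value = unname(counts))
df_result
```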
