What distinguishes dplyr::pull from purrr::pluck and magrittr::extract2?

Question

In the past, when working with a data frame and wanting to get a single column as a vector, I would use magrittr::extract2() like this:

library(dplyr)     # mutate() and %>%
library(magrittr)  # extract2()

mtcars %>%
  mutate(wt_to_hp = wt/hp) %>%
  extract2('wt_to_hp')

But I've seen that dplyr::pull() and purrr::pluck() also exist to do much the same job: return a single vector from a data frame, not unlike [[.

Assuming that I'm always loading all 3 libraries for any project I work on, what are the advantages and use cases of each of these 3 functions? Or more specifically, what distinguishes them from each other?

Answer 1

When you "should" use a function is really a matter of personal preference: which function expresses your intention most clearly? There are differences between them, though. For example, pluck works better when you want to do multiple extractions. From the help file:

 accessor(x[[1]])$foo 
 # is the same as
 pluck(x, 1, accessor, "foo")

So while it can be used to just extract a column, it's most useful when you have more deeply nested structures or you want to compose with an accessor function.
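
To make the nested case concrete, here is a minimal sketch. The list x and its field names (id, meta, tags) are invented for this illustration and do not come from the help file.

```r
library(purrr)

# Hypothetical nested structure, made up for this example.
x <- list(
  list(id = 1, meta = list(tags = c("a", "b"))),
  list(id = 2, meta = list(tags = c("c")))
)

# Chained [[ calls and a single pluck() call reach the same element.
by_brackets <- x[[1]][["meta"]][["tags"]][[2]]
by_pluck    <- pluck(x, 1, "meta", "tags", 2)

# An accessor function (here base length()) can be part of the path.
n_tags <- pluck(x, 1, "meta", "tags", length)

# Indexing past the end returns .default rather than raising an error,
# whereas x[[3]] would fail with "subscript out of bounds".
missing_value <- pluck(x, 3, "meta", .default = NA)
```

For a flat data frame, none of this buys you much over [[, which is why pluck tends to show up with lists and parsed JSON rather than with single-column extraction.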

The pull function is meant to blend in with the rest of the dplyr functions. It can take the name of a column in any of the ways the other functions in the package accept one. For example, it works with !! style expansion, where extract2 does not.

irispull <- function(x) {
  iris %>% pull(!!enquo(x))
}
irispull(Sepal.Length)
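
Besides unquoting, pull accepts the other column specifications that dplyr uses. A short sketch using the built-in mtcars data (the variable names on the left are just labels for this example):

```r
library(dplyr)

by_name   <- mtcars %>% pull(wt)     # bare column name
by_string <- mtcars %>% pull("wt")   # column name as a string
first_col <- mtcars %>% pull(1)      # positive integer: counted from the left
last_col  <- mtcars %>% pull(-1)     # negative integer: counted from the right
```

Negative positions have no direct equivalent in [[ or extract2, so this is one place where pull behaves differently and not just more readably.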

And extract2 is nothing more than a "more readable" wrapper for the base function [[. In fact it's defined as .Primitive("[["), so it expects column names as character strings or column indexes as integers.
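
A quick way to see this, again sketched with mtcars. The bare-name call is wrapped in try() because it is expected to fail in a session where no object called wt exists.

```r
library(magrittr)

by_name  <- mtcars %>% extract2("wt")  # same as mtcars[["wt"]]
by_index <- mtcars %>% extract2(6)     # wt is the 6th column of mtcars

# A bare name is evaluated as an ordinary R object, not as a column,
# so this errors unless a variable named wt happens to be defined.
bare_name_attempt <- try(mtcars %>% extract2(wt), silent = TRUE)
bare_name_failed  <- inherits(bare_name_attempt, "try-error")
```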

1 answers

solution1
10 ACCPTED 2019-01-09 16:24:26
