Dplyr select(contains()) with dynamic variable

Question

I have a data frame that is updated every quarter with new data, i.e. it gets wider every couple of months. As a next step, I would like to calculate the difference in values between the first and the latest observation. Using contains(), I would like to select the first and last observation dynamically, so that I don't have to retype the variable names after every update.

In other cases, I have been using !!sym() for mutate() like in the following example, which is working fine:

df <- df %>%
  mutate(var1 = ifelse(!!sym(first_year) == 0, 1, 0))

But when I try the following, I get an error (first_year equals 2008 in this case):

df <- df %>%
  select(contains(!!sym(first_year)))

Error: object '2008' not found

Is there a way to use dynamic variables in select(contains()) or select(matches()), or is this not possible?

Thanks for any help!

Answer 1

The documentation of ?contains states that the first argument should be a character vector:

 match A character vector. If length > 1, the union of the matches is taken.

Therefore, you don't have to use any tidy evaluation function such as sym():

library(dplyr)
x <- "Spe"
iris %>% select(contains(x)) %>% head()
#>   Species
#> 1  setosa
#> 2  setosa
#> 3  setosa
#> 4  setosa
#> 5  setosa
#> 6  setosa

Created on 2021-03-15 by the reprex package (v1.0.0)
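
Since the question also mentions matches(), here is a small sketch along the same lines. It relies on the documented behaviour quoted above (a character vector of length > 1 gives the union of the matches) and builds a regular expression from a variable for matches(). The variable names patterns and suffix are made up for illustration.

```r
library(dplyr)

# Several patterns stored in one variable: contains() takes the union
patterns <- c("Sepal", "Spe")
union_cols <- iris %>% select(contains(patterns)) %>% names()

# matches() takes a regular expression, which can be assembled from a variable
suffix <- "Width"
regex_cols <- iris %>% select(matches(paste0(suffix, "$"))) %>% names()

print(union_cols)
print(regex_cols)
```

In both cases the helper receives an ordinary character value, so no !! or sym() is involved.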

However, we have very little information about what you are working with (what do first_year and df look like?); this answer might be incorrect because of that.
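
To make this concrete for the case in the question, here is a minimal sketch under stated assumptions: the data frame, its column names (value_2008 and so on) and the variable last_year are invented, since the real df is not shown. The error in the question presumably arises because !!sym(first_year) injects the symbol `2008`, which is then looked up as an object. Passing the string itself avoids that.

```r
library(dplyr)

# Hypothetical wide data frame, one column per year (names are assumptions)
df <- tibble(
  id = 1:3,
  value_2008 = c(10, 20, 30),
  value_2009 = c(11, 19, 33),
  value_2021 = c(15, 25, 28)
)

# Keep the years as character strings; use as.character() if they are numeric
first_year <- "2008"
last_year <- "2021"

# contains() accepts the strings directly, no sym() needed
selected <- df %>% select(id, contains(c(first_year, last_year)))

# To compute with the two columns, build the full names and use the .data pronoun
first_col <- paste0("value_", first_year)
last_col <- paste0("value_", last_year)
result <- df %>% mutate(change = .data[[last_col]] - .data[[first_col]])

print(selected)
print(result)
```

Note that contains("2008") selects every column whose name contains that string, so if several columns share a year the selection will return all of them.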

Answer 1 (accepted), 2021-03-15 12:17:35
