How can I get data attributes from rlang's .data like I can with .?

Question

I am building a tidy-compatible function for use inside dplyr's mutate, to which I'd like to pass a variable and also the data set I'm working with, and then use information from both to build a vector.

As a basic example, imagine I want to return a string containing the mean of the variable and the number of rows in the data set (I know I could just take the length of var; ignore that, it's an example).

library(tidyverse)
library(rlang)

info <- function(var,df = get(".",envir = parent.frame())) {
  paste(mean(var),nrow(df),sep=', ')
}

dat <- data.frame(a = 1:10, i = c(rep(1,5),rep(2,5)))

#Works fine, 'types' contains '5.5, 10'
dat %>% mutate(types = info(a))

Ok, great so far. But now maybe I want it to work with grouped data. var will come from just one group, but . would still be the full data set. So instead I'll use rlang's .data pronoun, which refers to just the data currently being worked with.

However, .data does not behave like the dot. The dot is the data set itself, whereas .data is only a pronoun from which I can pull variables with .data[[varname]].

info2 <- function(var,df = get(".data",envir = parent.frame())) {
  paste(mean(var),nrow(.data),sep=', ')
}

#Doesn't work. nrow(.data) gives blank strings
dat %>% group_by(i) %>% mutate(types = info2(a))
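
To make the difference between the two concrete, here is a minimal sketch. The column names group_mean and full_rows are made up for illustration, and it assumes the magrittr pipe, since . is bound by %>% and not by the native |> pipe.

```r
library(dplyr)

dat <- data.frame(a = 1:10, i = c(rep(1, 5), rep(2, 5)))

res <- dat %>%
  group_by(i) %>%
  mutate(
    # .data[["a"]] is looked up in the current group only
    group_mean = mean(.data[["a"]]),
    # . is whatever was piped into mutate(): the whole grouped data frame
    full_rows = nrow(.)
  )
```

Here group_mean varies by group while full_rows is 10 on every row, which is exactly the split between the two objects described above.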

How can I get the full thing from .data? I know I didn't include it in the example, but specifically I need both some stuff from the attributes of dat AND some stuff from the variables in dat, properly subset for the grouping. So neither reverting to . nor just pulling out variables and getting stuff from there would work.

Answer 1

As Alexis mentioned in the above comment, this is not possible, as it's not the intended use of .data. However, now that I've given up on doing this directly, I've worked up a kludge using a combination of . and .data.

info <- function(var,df = get(".",envir = parent.frame())) {
  #First, get any information you need from .
  fulldatasize <- nrow(df)

  #Then, check if you actually need .data,
  #i.e. the data is grouped and you need a subsample
  if (length(var) < nrow(df)) {
      #If so, get the list of variables you want from .data, maybe all of them
      namesiwant <- names(df)

      #Get .data
      datapronoun <- get('.data',envir=parent.frame())

      #And remake df using just the subsample
      df <- data.frame(lapply(namesiwant, function(x) datapronoun[[x]]))
      names(df) <- namesiwant
  }

  #Now do whatever you want with the .data data
  groupsize <- nrow(df)

  paste(mean(var),groupsize,fulldatasize,sep=', ')
}

dat <- data.frame(a = 1:10, i = c(rep(1,5),rep(2,5)))

#types contains the within-group mean, then 5, then 10
dat %>% group_by(i) %>% mutate(types = info(a))
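
If you are free to change the function's signature, a less fragile variant of the same idea is to pass the full data set explicitly instead of fishing . and .data out of the calling frame. This is only a sketch: info_explicit, its full argument, and the label attribute are all hypothetical names introduced for illustration.

```r
library(dplyr)

dat <- data.frame(a = 1:10, i = c(rep(1, 5), rep(2, 5)))
attr(dat, "label") <- "demo"  # hypothetical attribute, for illustration only

# full is a hypothetical extra argument carrying the ungrouped data;
# var still arrives already subset to the current group
info_explicit <- function(var, full) {
  paste(mean(var), length(var), nrow(full), attr(full, "label"), sep = ", ")
}

res <- dat %>%
  group_by(i) %>%
  mutate(types = info_explicit(a, full = dat))
```

With this, types holds the within-group mean, the group size, the full row count and the attribute value. The trade-off is that the caller has to name the data set twice, which the kludge above avoids.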

Answer 2

Why not use length() instead of nrow() here?

dat <- data.frame(a = 1:10, i = c(rep(1,5),rep(2,5)))

info <- function(var) {
  paste(mean(var),length(var),sep=', ')
}
dat %>% group_by(i) %>% mutate(types = info(a))
#> # A tibble: 10 x 3
#> # Groups:   i [2]
#>        a     i types
#>    <int> <dbl> <chr>
#>  1     1     1 3, 5 
#>  2     2     1 3, 5 
#>  3     3     1 3, 5 
#>  4     4     1 3, 5 
#>  5     5     1 3, 5 
#>  6     6     2 8, 5 
#>  7     7     2 8, 5 
#>  8     8     2 8, 5 
#>  9     9     2 8, 5 
#> 10    10     2 8, 5
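
When the group size is wanted without relying on the length of a particular column, dplyr's n() should give the same number. A small sketch, assuming a dplyr version in which n() can be called from a helper that is itself called inside mutate (info_n is a made-up name):

```r
library(dplyr)

dat <- data.frame(a = 1:10, i = c(rep(1, 5), rep(2, 5)))

# n() reports the size of the current group; it only works while
# dplyr is evaluating a verb such as mutate() or summarise()
info_n <- function(var) {
  paste(mean(var), n(), sep = ", ")
}

res <- dat %>% group_by(i) %>% mutate(types = info_n(a))
```

Like length(), this only covers the row count. It does not give access to other columns or to attributes of the data set.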

solution1
0 ACCEPTED 2019-08-04 19:53:56

solution2
0 2019-08-08 16:01:06
