
What's the difference between these two statements (R / dplyr)

I have a question about dplyr. Given the data frame my_data:

library(dplyr)
set.seed(20160229)
my_data <- data.frame(
  y = c(rnorm(1000), rnorm(1000, 0.5), rnorm(1000, 1), rnorm(1000, 1.5)),
  x = c(rep('a', 2000), rep('b', 2000)),
  m = c(rep('i', 1000), rep('j', 2000), rep('i', 1000)))

case 1:

pdat <- my_data %>%
  group_by(x, m) %>%
  do(data.frame(loc = density(.$y)$x,
                dens = density(.$y)$y))

and case 2:

pdat <- my_data
pdat  <- group_by(my_data, x, m)
do(data.frame(pdat,loc=density(pdat$y)$x),dens=density(pdat$y)$y)

Why are these statements different? How can case 2 be changed to match case 1?

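Your call to do() is missing the .data argument. In case 2 the first argument is the result of data.frame(pdat, ...) rather than the grouped data frame, and pdat$y refers to the whole column instead of one group's rows. You need to either pipe the grouped data in, as in your case 1, or provide it explicitly and refer to the current group with the dot (.). Try something like: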

do(.data = pdat, data.frame(loc = density(.$y)$x, dens = density(.$y)$y))
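To make the role of the dot concrete, here is a minimal sketch that rebuilds my_data from the question and compares what do() sees per group with the full column. The names grouped and group_sizes are illustrative only and do not appear in the original post.

```r
library(dplyr)

# Same data as in the question
set.seed(20160229)
my_data <- data.frame(
  y = c(rnorm(1000), rnorm(1000, 0.5), rnorm(1000, 1), rnorm(1000, 1.5)),
  x = c(rep('a', 2000), rep('b', 2000)),
  m = c(rep('i', 1000), rep('j', 2000), rep('i', 1000)))

grouped <- group_by(my_data, x, m)

# Inside do(), the dot is the subset of rows belonging to the current group
group_sizes <- do(.data = grouped, data.frame(n_rows = nrow(.)))
group_sizes$n_rows   # 1000 1000 1000 1000 (one entry per x/m combination)

# Outside do(), grouped$y is the entire column, regardless of grouping
length(grouped$y)    # 4000
```

This is why density(pdat$y) in case 2 cannot reproduce case 1: it would estimate a single density over all 4000 values rather than one per group.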

And now they match:

a <- my_data %>%
  group_by(x, m) %>%
  do(data.frame(loc = density(.$y)$x,
                dens = density(.$y)$y))

# pdat is the grouped data frame created in case 2
b <- do(.data = pdat, data.frame(loc = density(.$y)$x, dens = density(.$y)$y))

identical(a, b)  # TRUE
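Both versions call density() twice per group. As an optional refinement, and only a sketch, do() can be given a braced expression so that the estimate is computed once and reused. The names grouped, pdat_once, and d are hypothetical and not from the original post.

```r
library(dplyr)

# Same data as in the question
set.seed(20160229)
my_data <- data.frame(
  y = c(rnorm(1000), rnorm(1000, 0.5), rnorm(1000, 1), rnorm(1000, 1.5)),
  x = c(rep('a', 2000), rep('b', 2000)),
  m = c(rep('i', 1000), rep('j', 2000), rep('i', 1000)))

grouped <- group_by(my_data, x, m)

pdat_once <- do(.data = grouped, {
  d <- density(.$y)                  # one kernel density estimate per group
  data.frame(loc = d$x, dens = d$y)  # evaluation points and density values
})

# 4 groups times density()'s default of 512 evaluation points
nrow(pdat_once)  # 2048
```

Because density() is deterministic for a given input, this should give the same loc and dens values as the two-call version.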
