Filter by quantile on a character vector

I have a character vector, and I want to keep only the values whose frequency is above the 95th percentile.

If I use the following command, it changes my data frame (i.e., only n and name remain).

  mydf %>% 
  count(name) %>%
  filter(n > quantile(n, 0.95))

If I use this command I get an error.

  mydf %>% 
  group_by(name) %>%
  filter(name > quantile(name, 0.95))

  Error in filter_impl(.data, quo) : Evaluation error: non-numeric argument 
  to binary operator.

Here is a small dput of the data:

mydf <- structure(list(name = c("Panda Express", "Noodles & Company", 
"Panda Express", "Panda Express", "Panda Express", "Panda Express", 
"Panda Express", "Noodles & Company", "Noodles & Company", "China"
), postal_code = c("85301", "85382", "89122", "89134", "85296", 
"85042", "89012", "15241", "85236", "85018")), .Names = c("name", 
"postal_code"), row.names = c(NA, 10L), class = "data.frame")

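We can use semi_join after the filter. It keeps the rows of the original data frame, with all of their columns, for the names that pass the count threshold: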

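library(dplyr)
# mydf is the data frame from the question
mydf %>% 
  count(name) %>%                        # one row per name, with its frequency n
  filter(n > quantile(n, 0.95)) %>%      # names above the 95th percentile of n
  semi_join(mydf, ., by = 'name')        # original rows for those names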
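
To see what each step contributes, the same pipeline can be split into named intermediate results and run on the sample data from the question. This is only a sketch: the names `name_counts`, `frequent_names`, and `result` are made up for illustration, and the data frame is rebuilt with `data.frame()` instead of the `dput` output.

```r
library(dplyr)

# Sample data from the question, rebuilt by hand
mydf <- data.frame(
  name = c("Panda Express", "Noodles & Company", "Panda Express",
           "Panda Express", "Panda Express", "Panda Express",
           "Panda Express", "Noodles & Company", "Noodles & Company",
           "China"),
  postal_code = c("85301", "85382", "89122", "89134", "85296",
                  "85042", "89012", "15241", "85236", "85018"),
  stringsAsFactors = FALSE
)

# Step 1: one row per name with its frequency n (here 1, 3 and 6)
name_counts <- count(mydf, name)

# Step 2: keep the names whose frequency exceeds the 95th percentile
# of the frequencies (the quantile is taken over the numeric column n,
# not over the character column name)
frequent_names <- filter(name_counts, n > quantile(n, 0.95))

# Step 3: keep the original rows whose name appears in frequent_names;
# semi_join() only filters mydf, so no column from frequent_names is added
result <- semi_join(mydf, frequent_names, by = "name")

print(result)
```

On this sample only "Panda Express" passes the threshold, so `result` holds its six rows with both `name` and `postal_code` intact.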
