
R dplyr function putting mutate, top_frac and ifelse together

I'm looking for a way to add a new column with mutate() that labels the top and bottom 20% of values using dplyr.

Here is my code; it isn't working for me.

DF1 <- DF %>%
  group_by(Timepoint) %>%
  filter (!is.na (log2_Concentration)) %>%
  arrange (desc(log2_Concentration)) %>%
  mutate (top_bottom=ifelse (log2_Concentration=top_frac(.2), "TOP20PERC",
          ifelse (log2_Concentration=top_frac(-.2), "BOTTOM20PERC", "MID")))

ggplot(DF1, aes(x = Timepoint, y=log2_Concentration,fill=Timepoint)) + 
  geom_boxplot() +
  geom_jitter(size=1,position=position_jitter(0.2), aes(col=DF1$top_bottom)) +
  scale_colour_manual(values = c("red", "gray", "blue"), 
                      labels = c("TOP20PERC", "MID", "BOTTOM20PERC"))

My hope is to label, per timepoint, the top 20%, the bottom 20%, and the rest as MID, so I can color these points in my ggplot.

[Image: the ggplot should show top, middle, and bottom points at each timepoint]

Thanks a lot gurus!

You can probably use quantile() to get the top and bottom 20% cutoffs within each Timepoint group.

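library(dplyr)

DF1 <- DF %>%
  # drop missing values before computing the cutoffs
  filter(!is.na(log2_Concentration)) %>%
  # quantile() is then evaluated separately for each timepoint
  group_by(Timepoint) %>%
  mutate(top_bottom = case_when(
           log2_Concentration > quantile(log2_Concentration, 0.8) ~ "TOP20PERC",
           log2_Concentration < quantile(log2_Concentration, 0.2) ~ "BOTTOM20PERC",
           TRUE ~ "MID")) %>%
  ungroup()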
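
For context on why the original attempt fails: top_frac() selects rows from a data frame, so it cannot be compared against a column, and a single = inside ifelse() is read as a named argument rather than the == comparison. Below is a minimal sketch of the quantile approach end to end, including the plot. The toy data frame is made up for illustration (only the column names come from the question), and the colour assignments are an assumption about the intended look.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data: 3 timepoints with 10 values each
set.seed(1)
DF <- data.frame(
  Timepoint = rep(c("T1", "T2", "T3"), each = 10),
  log2_Concentration = rnorm(30)
)

# Label each value relative to the 20% / 80% quantiles of its own timepoint
DF1 <- DF %>%
  filter(!is.na(log2_Concentration)) %>%
  group_by(Timepoint) %>%
  mutate(top_bottom = case_when(
    log2_Concentration > quantile(log2_Concentration, 0.8) ~ "TOP20PERC",
    log2_Concentration < quantile(log2_Concentration, 0.2) ~ "BOTTOM20PERC",
    TRUE ~ "MID"
  )) %>%
  ungroup()

# How many points received each label per timepoint
label_counts <- count(DF1, Timepoint, top_bottom)

# Map the new column to colour inside aes(); refer to the bare column
# name rather than DF1$top_bottom. A named vector in values ties each
# colour to its label regardless of factor order.
p <- ggplot(DF1, aes(x = Timepoint, y = log2_Concentration)) +
  geom_boxplot(outlier.shape = NA) +
  geom_jitter(aes(colour = top_bottom), size = 1, width = 0.2, height = 0) +
  scale_colour_manual(values = c(TOP20PERC = "red",
                                 MID = "gray",
                                 BOTTOM20PERC = "blue"))
```

Call print(p) to draw the plot. Because the comparisons are strict (> and <), values exactly equal to a cutoff fall into MID; switch to >= or <= if ties should count as top or bottom.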
