
How do I compare group means to individual observations and make a new TRUE/FALSE column?

I am new to R and this is my first post on SO - so please bear with me.

I am trying to identify outliers in my dataset. I have two data.frames:

(1 - original data set, 192 rows): observations and their value (AvgConc)

(2 - created with dplyr, 24 rows): Group averages from the original data set, along with quantiles, minimum, and maximum values

I want to create a new column within the original data set that gives TRUE/FALSE based on whether (AvgConc) is greater than the maximum or less than the minimum I have calculated in the second data.frame. How do I go about doing this?

Failed attempt:

Outliers <- Original.Data %>%
 group_by(Status, Stim, Treatment) %>%
 mutate(Outlier = Original.Data$AvgConc > Quantiles.Data$Maximum | Original.Data$AvgConc <  Quantiles.Data$Minimum) %>%
 as.data.frame()

Error: Column Outlier must be length 8 (the group size) or one, not 192

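The error occurs because 'Original.Data$AvgConc' and 'Quantiles.Data$Maximum' reach outside the grouped data. Inside mutate(), each group expects a result of its own length (8) or of length one, but the $ references return the full columns (192 and 24 values), so the comparison has length 192. Instead of referring to 'Quantiles.Data$', join 'Quantiles.Data' to 'Original.Data' by 'Status', 'Stim', and 'Treatment' so that every row carries its own group's 'Maximum' and 'Minimum', then compare the columns directly: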

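library(dplyr)
Original.Data %>%
   # Attach each group's Maximum and Minimum to the matching rows
   inner_join(Quantiles.Data %>% 
              select(Status, Stim, Treatment, Maximum, Minimum),
              by = c("Status", "Stim", "Treatment")) %>%
   # After the join the comparison is row-wise, so group_by() is not needed
   mutate(Outlier = (AvgConc > Maximum) | (AvgConc < Minimum)) %>%
   as.data.frame()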
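
To check the join approach on something small, here is a minimal sketch with toy data. The column names follow the question, but every value is invented for illustration, and the per-group 'Minimum' and 'Maximum' bounds are assumed rather than taken from the real 'Quantiles.Data'.

```r
library(dplyr)

# Toy stand-in for the original data set (values invented for illustration)
Original.Data <- data.frame(
  Status    = c("A", "A", "A", "B", "B", "B"),
  Stim      = c("x", "x", "x", "y", "y", "y"),
  Treatment = c("t1", "t1", "t1", "t2", "t2", "t2"),
  AvgConc   = c(5, 6, 50, 1, 10, 11)
)

# Hypothetical per-group bounds, one row per Status/Stim/Treatment combination
Quantiles.Data <- data.frame(
  Status    = c("A", "B"),
  Stim      = c("x", "y"),
  Treatment = c("t1", "t2"),
  Minimum   = c(2, 5),
  Maximum   = c(20, 15)
)

Outliers <- Original.Data %>%
  # Copy each group's bounds onto its observations
  inner_join(Quantiles.Data %>%
               select(Status, Stim, Treatment, Minimum, Maximum),
             by = c("Status", "Stim", "Treatment")) %>%
  # Flag values outside the group's [Minimum, Maximum] range
  mutate(Outlier = AvgConc > Maximum | AvgConc < Minimum)

print(Outliers)
```

In this toy data, 50 lies above group A's assumed maximum and 1 lies below group B's assumed minimum, so those two rows are the ones flagged. Note that if 'Minimum' and 'Maximum' were the plain min() and max() of each group, no observation could ever fall outside them, so the bounds presumably need to be fences of some kind (for example quantile-based limits).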
