
How do I compare group means to individual observations and make a new TRUE/FALSE column?

I am new to R and this is my first post on SO, so please bear with me.

I am trying to identify outliers in my dataset. I have two data.frames:

(1 - original data set, 192 rows): observations and their values (AvgConc)

(2 - created with dplyr, 24 rows): group averages from the original data set, along with quantiles, minimum, and maximum values

I want to create a new column in the original data set that gives TRUE/FALSE based on whether AvgConc is greater than the maximum or less than the minimum I have calculated in the second data.frame. How do I go about doing this?

Failed attempt:

Outliers <- Original.Data %>%
 group_by(Status, Stim, Treatment) %>%
 mutate(Outlier = Original.Data$AvgConc > Quantiles.Data$Maximum | Original.Data$AvgConc <  Quantiles.Data$Minimum) %>%
 as.data.frame()

Error: Column Outlier must be length 8 (the group size) or one, not 192

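The error arises because Original.Data$AvgConc and Quantiles.Data$Maximum bypass the grouping: inside a grouped mutate(), the first is always the full 192-row column and the second is the 24-row summary column, so the comparison has length 192, while mutate() expects a result of length 8 (the group size) or 1. Instead of indexing Quantiles.Data with $, join its Minimum and Maximum columns onto Original.Data by Status, Stim, and Treatment, then compare the bare column names.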
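A small sketch of that length mismatch may help. The data frame `toy` and its columns `g` and `x` are hypothetical and not part of the question; the point is only that a bare column name sees the current group's slice, while `toy$x` always sees the whole column.

```r
library(dplyr)

# Hypothetical toy data: two groups of three rows each
toy <- data.frame(g = rep(c("a", "b"), each = 3), x = 1:6)

lengths_seen <- toy %>%
  group_by(g) %>%
  summarise(
    n_bare   = length(x),      # bare name: only the current group's rows
    n_dollar = length(toy$x)   # $ indexing: the full column, grouping ignored
  )

lengths_seen
```

For each group, `n_bare` is 3 and `n_dollar` is 6, which is the same kind of mismatch as the 8 versus 192 reported in the error.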

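library(dplyr)

Outliers <- Original.Data %>%
   # attach each row's group bounds from the summary table
   inner_join(Quantiles.Data %>%
              select(Status, Stim, Treatment, Maximum, Minimum),
              by = c("Status", "Stim", "Treatment")) %>%
   # every row now carries its own Minimum/Maximum, so no group_by() is needed
   mutate(Outlier = (AvgConc > Maximum) | (AvgConc < Minimum)) %>%
   as.data.frame()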
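For anyone who wants to try the join on something runnable, here is a minimal end-to-end sketch. The data values are made up, and the way Minimum and Maximum are computed (Tukey fences, 1.5 times the IQR beyond the quartiles) is an assumption, since the question does not say how those columns were derived. It also assumes dplyr 1.0 or later for the `.groups` argument.

```r
library(dplyr)

# Hypothetical stand-in for Original.Data: 2 groups of 8 rows
Original.Data <- data.frame(
  Status    = rep(c("A", "B"), each = 8),
  Stim      = "S1",
  Treatment = "T1",
  AvgConc   = c(10, 11, 12, 13, 14, 15, 16, 40,
                20, 21, 22, 23, 24, 25, 26, 27)
)

# Assumed summary table: group mean, quartiles, and Tukey fences as bounds
Quantiles.Data <- Original.Data %>%
  group_by(Status, Stim, Treatment) %>%
  summarise(
    Mean = mean(AvgConc),
    Q1   = quantile(AvgConc, 0.25, names = FALSE),
    Q3   = quantile(AvgConc, 0.75, names = FALSE),
    .groups = "drop"
  ) %>%
  mutate(
    Minimum = Q1 - 1.5 * (Q3 - Q1),
    Maximum = Q3 + 1.5 * (Q3 - Q1)
  )

# Join the bounds onto every observation, then compare row by row
Outliers <- Original.Data %>%
  inner_join(
    Quantiles.Data %>% select(Status, Stim, Treatment, Minimum, Maximum),
    by = c("Status", "Stim", "Treatment")
  ) %>%
  mutate(Outlier = AvgConc > Maximum | AvgConc < Minimum)

# Show only the flagged rows (here, the 40 in group A)
Outliers[Outliers$Outlier, c("Status", "AvgConc", "Minimum", "Maximum")]
```

If the summary table is not needed for anything else, the same bounds could probably be computed directly inside a grouped mutate() on the original data, which would avoid the join altogether.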


