Outliers in a dataframe, but I want to do this for grouped rows of a dataframe in R

Question

Example dataframe.

I want to detect outliers per group and display them in a separate dataframe. For example, for the species name anthopleura aureradiata, I want to look at the values 27.75, 6.83, and 23.91 and check for outliers among them. If I find that row 4 is an outlier for that particular species, I want to display it in my new dataframe. Does anyone know how to go about this?

Reproducible example:

x = data.frame("species" = c("Agao", "Beta", "Beta", "Beta", "Carrot", "Carrot"), "sum" = c(1, 100, 5, 4, 3, 0))

Answer 1

We can adapt this function to our needs and use it to filter the outliers within each group into a new dataframe.

library(dplyr)

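remove_outliers <- function(x, na.rm = TRUE, ...) {
    # First and third quartiles of x.
    qnt <- quantile(x, probs = c(.25, .75), na.rm = na.rm, ...)
    # Fence width: 1.5 times the interquartile range.
    H <- 1.5 * IQR(x, na.rm = na.rm)
    # Despite the name, this returns TRUE for values outside the fences (the outliers).
    x < (qnt[1] - H) | x > (qnt[2] + H)
}

# Keep only the rows flagged as outliers within each species.
separate_dataframe <- x %>% group_by(species) %>% filter(remove_outliers(sum))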
separate_dataframe

# species   sum
#  <fct>   <dbl>
#1 Beta     -100
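
To see why only the -100 row is returned, the fences can be worked out by hand for the Beta group. The sketch below uses base R only and is not part of the original answer; the variable names (`beta`, `lower`, `upper`, `flagged`, and the `_q` variants) are illustrative. It also applies the same rule to the three Beta values from the question's own example, where 100 is not flagged because the IQR of so few values is itself large.

```r
# Values for species "Beta" in the answer's data.
beta <- c(1, 5, 4, -100)

# Quartiles with R's default quantile type (type 7): -24.25 and 4.25.
qnt <- quantile(beta, probs = c(0.25, 0.75), names = FALSE)
h <- 1.5 * IQR(beta)          # 1.5 * 28.5 = 42.75

lower <- qnt[1] - h           # -24.25 - 42.75 = -67
upper <- qnt[2] + h           #   4.25 + 42.75 =  47

# Only -100 falls outside [-67, 47].
flagged <- beta[beta < lower | beta > upper]
flagged

# The three "Beta" values from the question's own example.
beta_q <- c(100, 5, 4)
qnt_q <- quantile(beta_q, probs = c(0.25, 0.75), names = FALSE)  # 4.5 and 52.5
h_q <- 1.5 * IQR(beta_q)                                         # 1.5 * 48 = 72

# The upper fence is 52.5 + 72 = 124.5, so 100 is not flagged here.
flagged_q <- beta_q[beta_q < qnt_q[1] - h_q | beta_q > qnt_q[2] + h_q]
flagged_q
```

With very small groups the 1.5 * IQR rule can be quite permissive, so it may be worth checking group sizes before relying on it.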

Data

x = data.frame(species = c("Agao", "Beta", "Beta", "Beta", "Beta", 
              "Carrot", "Carrot"),sum = c(1, 1, 5, 4, -100, 3,0))

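If dplyr is not available, roughly the same per-group filter can be sketched in base R with `ave()`. This is an alternative under the same 1.5 * IQR assumption, not the answer's own code; `is_outlier`, `flag`, and `outliers` are names chosen here for illustration.

```r
# Data from the answer above.
x <- data.frame(
  species = c("Agao", "Beta", "Beta", "Beta", "Beta", "Carrot", "Carrot"),
  sum     = c(1, 1, 5, 4, -100, 3, 0)
)

# TRUE where a value lies more than 1.5 * IQR outside the quartiles.
is_outlier <- function(v, na.rm = TRUE) {
  qnt <- quantile(v, probs = c(0.25, 0.75), na.rm = na.rm, names = FALSE)
  h <- 1.5 * (qnt[2] - qnt[1])
  v < (qnt[1] - h) | v > (qnt[2] + h)
}

# ave() applies the function within each species and returns a vector
# aligned with the rows of x. The result is numeric 0/1 because x$sum
# is numeric, so it is converted back to logical.
flag <- as.logical(ave(x$sum, x$species, FUN = is_outlier))

# which() drops any NA flags that missing values would produce.
outliers <- x[which(flag), , drop = FALSE]
outliers
```

On this data the result should contain the single Beta row with sum -100, matching the dplyr output above.
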
Answer 1
0 2020-01-24 06:12:03