How to remove the 'minimum' and 'maximum' values in a data frame and compute the average in 'R'

Data

I have data like this. There are six entries for the code 5005132#2000, where the minimum is 0 and the maximum is 22, and seven entries for the code 5008568#2000, where the minimum is 0 and the maximum is 11. I need to remove the minimum and maximum values for each code and then compute the average of the remaining values for that code.

The average for 5005132#2000 should be 7.75, and the average for 5008568#2000 should be 7.8.
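The data itself is not reproduced in the text, so here is a minimal base R sketch with made-up values. The numbers in `Cycle.Time` are hypothetical, chosen only to match the entry counts, minima, maxima and expected averages described above; the column names `Code` and `Cycle.Time` are taken from the answers.

```r
# Hypothetical data: the real values are not shown, so these are invented to
# match the description (6 and 7 entries, min 0, max 22 and 11).
df <- data.frame(
  Code = c(rep("5005132#2000", 6), rep("5008568#2000", 7)),
  Cycle.Time = c(0, 5, 7, 9, 10, 22,
                 0, 6, 7, 8, 9, 9, 11)
)

# Drop every value equal to the group's min or max, then average the rest.
trimmed_mean <- function(x) mean(x[!x %in% range(x)])

# One row per Code, with the trimmed average in the Cycle.Time column.
averages <- aggregate(Cycle.Time ~ Code, data = df, FUN = trimmed_mean)
print(averages)
```

With these sample values the result is 7.75 for 5005132#2000 and 7.8 for 5008568#2000, which gives something to check the answers against.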

One solution is to use data.table. A data.table is like a data.frame but with added functionality. You first need to load the data.table package and convert your data.frame (df) to a data.table:

library(data.table)
setDT(df)

From there, flag the values at the extremes within each group using by, filter them out, then get the mean of the remaining values.

# Solution: 
df[, 
    # ID rows where value is min/max
    .(Cycle.Time, "drop" = Cycle.Time %in% range(Cycle.Time)), by=Code][
    # Keep rows where the value is not the min/max and get the mean per Code
    drop==FALSE, mean(Cycle.Time), by=Code]

An alternative is to use dplyr:

library(dplyr)

df %>%
  group_by(Code) %>%
  # Drop rows equal to the group's min or max
  filter(!Cycle.Time %in% range(Cycle.Time)) %>%
  summarize(mean(Cycle.Time))

And to store that in a data.frame:

df %>% 
  group_by(Code) %>% 
  filter(!Cycle.Time %in% range(Cycle.Time)) %>% 
  summarize(mean(Cycle.Time)) %>% 
  data.frame -> averages
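One thing to watch: `!Cycle.Time %in% range(Cycle.Time)` removes every row equal to the minimum or maximum, so tied extremes are all dropped. If only one minimum and one maximum should be removed per code, a possible variant is sketched below. The helper name `trim_one_each` and the sample vector are made up for illustration.

```r
# Hypothetical helper: remove exactly one minimum and one maximum, then average.
trim_one_each <- function(x) {
  stopifnot(length(x) > 2)  # need at least one value left after trimming
  (sum(x) - min(x) - max(x)) / (length(x) - 2)
}

x <- c(0, 0, 4, 6, 22)  # made-up values with a tied minimum

mean(x[!x %in% range(x)])  # drops both zeros and 22: mean of 4 and 6 is 5
trim_one_each(x)           # drops one zero and 22: mean of 0, 4 and 6 is 10/3
```

To apply it per group, something like `tapply(df$Cycle.Time, df$Code, trim_one_each)` should work. When the minimum and maximum are unique within each code, both approaches give the same averages.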
