Aggregate data by 5 min, excluding max and min

I have a data frame like so:

# One timestamp per minute; `by = "min"` replaces the ignored `units` argument
Time <- seq(as.POSIXct("2017-11-14 00:01:00", tz = "CET"), as.POSIXct("2017-11-14 00:15:00", tz = "CET"), by = "min")
A <- c(2,3,5,2,5,8,17,3,5,8,17,3,5,1,5)
B <- c(1,1,2,1,2,1,2,2,2,4,6,7,8,8,9)

DF <- data.frame(Time=Time, A=A, B=B)

and I want a "newDF" in which the data are aggregated by 5 min, but with the max and min values of each column excluded before the aggregation.

Using dplyr I get to something like this:

library(dplyr)
library(lubridate)  # provides floor_date()

DF$TimeStamp_round <- floor_date(DF$Time, unit = "5 minutes")
DF <- DF %>%
  group_by(TimeStamp_round) %>%
  mutate(TimeStamp_count = cur_group_id())

newDF<-DF %>%
  group_by(TimeStamp_count) %>%
  summarise(across(where(is.numeric), mean))

but I still can't manage to exclude the max/min values before the summarise() call in newDF.

Note: I do not want to do it manually for each column, because the real DF has 350 columns.

After grouping by 'TimeStamp_round', we can drop the minimum and maximum of each group (the two values returned by range()) before taking the mean:

library(dplyr)
DF %>%
     group_by(TimeStamp_round) %>% 
     summarise(across(A:B, ~ mean(.[!. %in% range(.)])), .groups = 'drop')
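
To make the lambda above easier to follow, here is a small base R sketch of what `.[!. %in% range(.)]` does to a single group. The vectors `x` and `y` are illustrative stand-ins for one 5-minute group of a column, not part of the original answer.

```r
# Illustrative values for one 5-minute group of a column
x <- c(5, 8, 17, 3, 5)

range(x)                  # c(min(x), max(x)), here 3 and 17
keep <- !x %in% range(x)  # FALSE wherever x equals the min or the max
mean(x[keep])             # mean of 5, 8, 5, i.e. 6

# Caveat: %in% drops every tied copy of the min and max, so a group with
# at most two distinct values is emptied and its mean is NaN
y <- c(2, 1, 2, 2, 2)
mean(y[!y %in% range(y)])  # NaN
```

With the example data this caveat matters: several groups of column B contain only two distinct values, and any group with a single row is emptied as well.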

Or, if there are more columns and you want the mean only for the numeric ones:

DF %>%
    select(-Time) %>%
    group_by(TimeStamp_round) %>% 
    summarise(across(where(is.numeric), 
           ~ mean(.[!. %in% range(.)])), .groups = 'drop')
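
If tied values should not all be removed, one possible variation (not the approach in the answer above) is to drop exactly one minimum and one maximum per group. The sketch below assumes that reading of the question. `trimmed_mean`, `toy` and `bucket` are hypothetical names, and `bucket` is a plain stand-in for the rounded timestamp, so its groups do not match the `floor_date()` grouping exactly.

```r
library(dplyr)

# Hypothetical helper: mean after dropping one minimum and one maximum.
# Returns NA when fewer than three values are available.
trimmed_mean <- function(x) {
  x <- sort(x)                         # sort() also drops NA values
  if (length(x) < 3) return(NA_real_)
  mean(x[-c(1, length(x))])            # drop the first and last sorted value
}

# Values from the question, grouped by a stand-in bucket index
toy <- data.frame(
  bucket = rep(1:3, each = 5),
  A = c(2, 3, 5, 2, 5, 8, 17, 3, 5, 8, 17, 3, 5, 1, 5),
  B = c(1, 1, 2, 1, 2, 1, 2, 2, 2, 4, 6, 7, 8, 8, 9)
)

res <- toy %>%
  group_by(bucket) %>%
  summarise(across(where(is.numeric), trimmed_mean), .groups = 'drop')
```

Because the helper takes a plain vector, it can be passed to across() by name, and it scales to many columns in the same way as the lambda version.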
