
Calculate min, maximum and mean in R

I have a data set with 130 rows and two columns. I want to calculate the mean, minimum and maximum of every 5 rows of the second column using R. Using colMeans with the command rep(colMeans(matrix(data$Pb, nrow=5), na.rm=TRUE), each=5), I was able to compute the mean for every 5 rows. However, I am not able to compute the max and min, since there is no built-in equivalent for them. I tried the approach suggested here, using 5 rows instead of 2, but I get an error that dim(X) must have a positive length. Can someone please help me understand what I should do to fix this and compute the above quantities? My end goal is to plot the min, mean and max for every 5 rows.

Thanks in advance.

If we are looking for a function to find the max and min of each column of a matrix, colMaxs and colMins from matrixStats can be used.

library(matrixStats)
colMaxs(mat)
#[1]  7  8 20

colMins(mat)
#[1] 3 1 7
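The reshape trick from the question also extends to min and max without extra packages. The sketch below assumes a numeric vector whose length is a multiple of 5 (as with the 130 rows in the question); `pb` is made-up stand-in data, not the asker's. With matrixStats loaded, colMins(blocks) and colMaxs(blocks) should give the same values as the apply() calls.

```r
# Stand-in for data$Pb: 130 made-up values (assumption, not the asker's data)
set.seed(1)
pb <- sample(1:9, 130, replace = TRUE)

# matrix() fills column by column, so each column holds 5 consecutive values
blocks <- matrix(pb, nrow = 5)

# apply() over columns (MARGIN = 2) gives one statistic per block of 5
block_min  <- apply(blocks, 2, min, na.rm = TRUE)
block_max  <- apply(blocks, 2, max, na.rm = TRUE)
block_mean <- colMeans(blocks, na.rm = TRUE)

length(block_max)  # one value per block of 5 rows
```

If the row count is not a multiple of 5, matrix() recycles values with a warning, so a grouping index is the safer route in that case.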

To do this for every 5 rows of the data set's columns, use gl to create a grouping index for each block of 5 rows, and then use by to apply colMaxs, colMins or colMeans to each group:

by(data, list(gr=as.numeric(gl(nrow(data), 5, nrow(data)))), 
                 FUN = function(x) colMaxs(as.matrix(x)))

In the same way, we can find the colMins or colMeans:

by(data, list(gr=as.numeric(gl(nrow(data), 5, nrow(data)))),
             FUN = function(x) colMins(as.matrix(x)))

by(data, list(gr=as.numeric(gl(nrow(data), 5, nrow(data)))),
             FUN = function(x) colMeans(as.matrix(x)))
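To see what the gl-based index looks like, here is a small sketch. It assumes 42 rows, matching the sample data in this answer; `x` is a made-up column used only for illustration.

```r
n <- 42                         # row count of the sample data in this answer
gr <- as.numeric(gl(n, 5, n))   # 1 1 1 1 1 2 2 2 2 2 ... with a short last group

table(gr)                       # group sizes: eight groups of 5, then one group of 2

# For a single column, tapply() with the same index is enough
set.seed(24)
x <- rnorm(n)                   # made-up values standing in for one column
tapply(x, gr, max)
```

Because the index is truncated to the number of rows, the last group simply holds the leftover rows when the row count is not a multiple of 5.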

The above can be done in a compact way with dplyr

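 library(dplyr)
 data %>%
    group_by(gr = as.numeric(gl(nrow(.), 5, nrow(.)))) %>%
    # across() (dplyr >= 1.0.0) replaces the deprecated summarise_each(funs(min, max, mean))
    summarise(across(everything(), list(min = min, max = max, mean = mean)))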

To do the plotting, we can extend this with ggplot:

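library(ggplot2)
library(tidyr)
data %>% 
    group_by(gr = as.numeric(gl(nrow(.), 5, nrow(.)))) %>%
    # across() (dplyr >= 1.0.0) replaces the deprecated summarise_each(funs(min, max, mean))
    summarise(across(everything(), list(min = min, max = max, mean = mean))) %>%
    # reshape to long format, then split names like "Pb_min" into column and statistic
    gather(Var, Val, -gr) %>% 
    separate(Var, into = c("Var1", "Var2")) %>%
    ggplot(., aes(x=factor(gr), y=Val, fill=Var2)) + 
           geom_bar(stat="identity")+
           facet_wrap(~Var1)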
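As an alternative that needs no extra packages, the three statistics can also be drawn as lines with base graphics. This is only a sketch with made-up data: `pb` stands in for one column, and the axis labels are assumptions.

```r
set.seed(24)
pb <- sample(1:9, 42, replace = TRUE)            # made-up stand-in for one column
gr <- as.numeric(gl(length(pb), 5, length(pb)))  # index for blocks of 5 rows

# One row per group, one column per statistic
stats <- t(sapply(split(pb, gr),
                  function(v) c(min = min(v), mean = mean(v), max = max(v))))

# Draw min, mean and max as separate lines over the group index
matplot(stats, type = "b", pch = 1:3, lty = 1, col = 1:3,
        xlab = "Group of 5 rows", ylab = "Pb")
legend("topright", legend = colnames(stats), pch = 1:3, lty = 1, col = 1:3)
```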

Data used in the examples above

mat <- matrix(c(3,1,20,5,4,12,6,2,9,7,8,7), byrow=TRUE, ncol=3) 
set.seed(24)
data <- data.frame(Pb = sample(1:9, 42, replace=TRUE), Ps = rnorm(42))

A nice option for this is the base by function combined with apply. Below is an example where you first make an index of the groups and then apply your function to each group:

m <- matrix(runif(130*2),130,2)
group <- rep(seq(nrow(m)), each=5, length.out=nrow(m))
res <- by(m, INDICES = group, FUN = function(x){apply(x, MARGIN=2, FUN=max)})
class(res) # "by" class
do.call(rbind, res) # matrix
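If the result should have one value per original row, as with the rep(..., each=5) call in the question, base ave() may be a convenient fit. A minimal sketch with a made-up column `x`:

```r
set.seed(1)
x <- runif(130)                                   # made-up column of 130 values
group <- rep(seq_along(x), each = 5, length.out = length(x))

# ave() returns a vector as long as x, with every value
# replaced by the statistic of the group it belongs to
x_max <- ave(x, group, FUN = max)
x_min <- ave(x, group, FUN = min)

head(cbind(x, group, x_min, x_max), 6)
```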
