I have a CSV file with the columns A, B, C, D and E.
I created a function fun1 like this to summarize the data:
fun1 <- function(x){c(len=length(x), min=min(x), max=max(x))}
When I summarize a particular column, it works:
summaryBy(A ~ B, data=data1, FUN=fun1 , keep.names=TRUE)
But how do I add an additional function to fun1, such as sum(C) (which does not depend on x), and use it in summaryBy to get the corresponding results when grouping by B?
For example,
A B C D E
1 2 3 4 5
1 2 4 5 7
1 3 5 7 8
I need to group by B and summarize A within each group, which gives two groups (B = 2 and B = 3). sum(C) should also be computed per group of B, but it does not depend on A.
The result should be:
B len min max sum(C)
2 2 1 1 7
3 1 1 1 5
Try this:
library(doBy)
summaryBy(A + C ~ B, data = data1, FUN = c(length, min, max, sum))[c(-3, -5, -7, -8)]
giving:
B A.length A.min A.max C.sum
1 2 2 1 1 7
2 3 1 1 1 5
summaryBy might not be the best fit for this problem. With sqldf it could be written like this:
library(sqldf)
sqldf("select B, count(A) len, min(A) min, max(A) max, sum(C) sum from data1 group by B")
giving:
B len min max sum
1 2 2 1 1 7
2 3 1 1 1 5
Note: The examples above use the following data:
data1 <- structure(list(A = c(1L, 1L, 1L), B = c(2L, 2L, 3L), C = 3:5,
D = c(4L, 5L, 7L), E = c(5L, 7L, 8L)), .Names = c("A", "B",
"C", "D", "E"), class = "data.frame", row.names = c(NA, -3L))
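As a cross-check that needs no extra packages, the same grouped summary could be sketched in base R roughly as follows. This is only one possible approach, not something proposed in the answers above. It rebuilds data1 so that it runs on its own, and the names groups and res are illustrative.

```r
# Same sample data as above, rebuilt with data.frame() for brevity.
data1 <- data.frame(A = c(1L, 1L, 1L), B = c(2L, 2L, 3L), C = 3:5,
                    D = c(4L, 5L, 7L), E = c(5L, 7L, 8L))

# Split the rows by B, then summarize A and C separately within each group.
groups <- split(data1, data1$B)
res <- do.call(rbind, lapply(groups, function(g) {
  data.frame(B   = g$B[1],
             len = length(g$A),  # number of rows in the group
             min = min(g$A),
             max = max(g$A),
             sum = sum(g$C))     # sum of C, independent of A
}))
rownames(res) <- NULL  # drop the group labels that rbind adds as row names
print(res)
```

Because each group is passed to the function as a whole data frame, any column can be summarized inside it, which is what a single-column function like fun1 cannot do.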