I have a data frame A with n columns. For each column, grouped by factor B, I need to find the max value (but not 0 when the other values are all less than 0).
dataframe(A)
B a b
1 0 0
2 0 0
3 0 0
1 -0.1 0.1
2 0.2 -0.3
3 0 1
1 -0.3 0.4
2 -0.5 0.2
3 0.1 0.2
The output I'm looking for looks like this:
B a b
1 -0.3 0.4
2 0.2 0.2
3 0.1 1
I know that I can use the aggregate function, but it only works for one column at a time.
The algorithm for each column (within each group of B) is:
1. Ignore all zeros.
2. If all remaining values are < 0, take the min of the values; otherwise take the max.
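The two steps above can be sketched as a small base R helper applied to the values of one group. The name `pick_extreme` is hypothetical, and returning `NA` for a group that contains only zeros is an assumption; the question does not say what should happen in that case.

```r
# Hypothetical helper: apply the two-step rule to one numeric vector.
pick_extreme <- function(x) {
  nz <- x[x != 0]                        # step 1: drop all zeros
  if (length(nz) == 0) return(NA_real_)  # assumption: an all-zero group gives NA
  if (all(nz < 0)) min(nz) else max(nz)  # step 2: min if all negative, else max
}

pick_extreme(c(0, -0.1, -0.3))  # column a, B = 1: returns -0.3
pick_extreme(c(0, 0.2, -0.5))   # column a, B = 2: returns 0.2
```

Without the length check, a group of only zeros would reach `min()` with an empty vector, which returns `Inf` with a warning.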
Here is a solution with base R:
f1 <- function(x) { x1 <- x[x!=0]; if(all(x1<0)) min(x1) else max(x1) }
aggregate(cbind(a,b) ~ B, data=A, FUN=f1)
(The function f1() is taken from the answer of @akrun.)
result:
#> aggregate(cbind(a,b) ~ B, data=A, FUN=f1)
# B a b
#1 1 -0.3 0.4
#2 2 0.2 0.2
#3 3 0.1 1.0
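Since the question mentions n columns, naming each one inside `cbind()` may become tedious. Here is a minimal sketch, assuming every column other than `B` is numeric and should be summarised: the `.` on the left-hand side of the formula stands for all remaining columns. The data frame is rebuilt here only so that the snippet runs on its own.

```r
# Same data as in the question, rebuilt with data.frame()
A <- data.frame(
  B = rep(1:3, times = 3),
  a = c(0, 0, 0, -0.1, 0.2, 0, -0.3, -0.5, 0.1),
  b = c(0, 0, 0, 0.1, -0.3, 1, 0.4, 0.2, 0.2)
)

# Same rule as f1() above: drop zeros, then min if all negative, else max
f1 <- function(x) { x1 <- x[x != 0]; if (all(x1 < 0)) min(x1) else max(x1) }

# "." means every column of A except the grouping variable B
res <- aggregate(. ~ B, data = A, FUN = f1)
res
```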
data:
A <- structure(list(B = c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L), a = c(0,
0, 0, -0.1, 0.2, 0, -0.3, -0.5, 0.1), b = c(0, 0, 0, 0.1, -0.3,
1, 0.4, 0.2, 0.2)), .Names = c("B", "a", "b"), class = "data.frame",
row.names = c(NA, -9L))
We can try with data.table
library(data.table)
f1 <- function(x) {x1 <- x[x!=0];
if(all(x1<0)) min(x1) else max(x1)}
setDT(A)[, lapply(.SD, f1), by = B]
# B a b
#1: 1 -0.3 0.4
#2: 2 0.2 0.2
#3: 3 0.1 1.0
Or with dplyr
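library(dplyr)
# The original answer used summarise_each(funs(f1)); both functions are
# deprecated in current dplyr, so across() (dplyr >= 1.0.0) is used instead.
A %>%
  group_by(B) %>%
  summarise(across(everything(), f1))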
# A tibble: 3 × 3
# B a b
# <int> <dbl> <dbl>
#1 1 -0.3 0.4
#2 2 0.2 0.2
#3 3 0.1 1.0