I'm rather new to R and have a question that seems pretty straightforward. I want to use rowSums, but include in the sum only values within a specific range (e.g., greater than 0).
For example, with the last column being the requested sum:
col1 col2 col3 col4 totyearly
1 -5 3 4 NA 7
2 1 40 -17 -3 41
3 NA NA -2 -5 0
4 NA 1 1 1 3
What I currently have is:
df$totyearly <- rowSums(df[, 1:4], na.rm=TRUE)
How do I add a condition so that only positive values are summed?
We can use replace to set the values less than 0 to 0 and then take rowSums:
# restrict to the four data columns so an existing totyearly column is not summed
df$totyearly <- rowSums(replace(df[, 1:4], df[, 1:4] < 0, 0), na.rm = TRUE)
df
# col1 col2 col3 col4 totyearly
#1 -5 3 4 NA 7
#2 1 40 -17 -3 41
#3 NA NA -2 -5 0
#4 NA 1 1 1 3
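Since the question asks about a general range rather than only positive values, the same idea can be wrapped in a small helper. This is only a sketch: row_sums_in_range and its lower and upper arguments are hypothetical names introduced here, not part of base R, and the bounds are assumed to be inclusive.

```r
# Hypothetical helper: sum each row, counting only values in [lower, upper].
row_sums_in_range <- function(data, lower = -Inf, upper = Inf) {
  m <- as.matrix(data)
  # Non-missing values outside the range are set to 0 so they add nothing
  outside <- !is.na(m) & (m < lower | m > upper)
  m[outside] <- 0
  # Remaining NAs are dropped by na.rm = TRUE
  rowSums(m, na.rm = TRUE)
}

# Same data as in the question
df <- data.frame(
  col1 = c(-5, 1, NA, NA),
  col2 = c(3, 40, NA, 1),
  col3 = c(4, -17, -2, 1),
  col4 = c(NA, -3, -5, 1)
)

df$totyearly <- row_sums_in_range(df[, 1:4], lower = 0)
df$totyearly
# 7 41 0 3
```

Passing both bounds, for example row_sums_in_range(df[, 1:4], lower = 0, upper = 10), would also leave out the 40 in the second row.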
You could write your own custom sum function and apply it to each row:
df <- read.table(text = "
col1 col2 col3 col4 totyearly
1 -5 3 4 NA 7
2 1 40 -17 -3 41
3 NA NA -2 -5 0
4 NA 1 1 1 3",
header = TRUE)
#define custom sum function
sum.pos <- function(x) sum(x[x > 0], na.rm = TRUE)
#apply it to each row
df$totyearly <- apply(df[ , 1:4], 1, sum.pos)
#or equivalently
df$totyearly <- apply(df[ , 1:4], 1, function(x) sum(x[x > 0], na.rm = TRUE))
Multiply by a logical check and then sum:
rowSums(df[, 1:4] * (df[, 1:4] >= 0), na.rm = TRUE)
# 1 2 3 4
# 7 41 0 3
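To see why the multiplication works, it may help to look at a single row. The comparison yields a logical vector, R coerces TRUE and FALSE to 1 and 0 in arithmetic, and an NA stays NA until na.rm = TRUE drops it. A minimal illustration, using the first row of the example data (x and keep are just illustrative names):

```r
x <- c(-5, 3, 4, NA)          # first row of the example data

keep <- x >= 0                # FALSE TRUE TRUE NA
masked <- x * keep            # 0 3 4 NA: negatives are zeroed, NA is kept

sum(masked, na.rm = TRUE)
# 7
```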