Creating a new variable that contains conditional rowsums in R

Question

I have a dataframe with 12 variables:

id_group1, id_group2, ..., id_group11 : 11 variables with a numeric value

mean_id : mean over all the above mentioned id_group variables

What I would need now is a new variable that contains the rowsum only for id_group variables whose value is LARGER THAN mean_id.

I am new to R and am still struggling with seemingly simple operations - so far I have tried using ifelse constructions but it never seemed to work.

Does anyone have an idea how to go about this?

Answer 1

Here is one option with apply. Loop over the rows (assuming that the 12th and last column is 'mean_id'), subset the other elements that are greater than the 12th, and take the sum:

apply(df1, 1, function(x) sum(x[-12][x[-12] > x[12]], na.rm = TRUE))
#[1] 42 40 52 39 50 51 49 49 24 27

Or, with rowSums, replace the elements in the columns other than the 12th with NA wherever the value is less than or equal to the mean column, then take the rowSums with na.rm = TRUE:

rowSums(replace(df1[-12], df1[-12] <= df1[,12], NA), na.rm = TRUE)
#[1] 42 40 52 39 50 51 49 49 24 27
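
Since the question asks for a new variable, the result can be assigned back to the data frame. Below is a minimal sketch of that, assuming the id_group columns can be picked out by name rather than by position. The toy data and the column name sum_above_mean are made up for illustration and are not from the question.

```r
# Hypothetical toy data: three id_group columns (the question has 11).
df <- data.frame(
  id_group1 = c(1, 4, 2),
  id_group2 = c(5, 4, 8),
  id_group3 = c(3, 4, 2)
)

# Select the id_group columns by name so the code does not depend on
# mean_id sitting in a fixed position.
id_cols <- grep("^id_group", names(df), value = TRUE)
df$mean_id <- rowMeans(df[id_cols])

# Compare each value with its row's mean_id. The vector is recycled down
# the columns, so element [i, j] is compared with mean_id[i].
vals <- as.matrix(df[id_cols])
above <- vals > df$mean_id

# Multiply by the logical mask (TRUE = 1, FALSE = 0) and sum each row.
# "sum_above_mean" is an illustrative name, not one from the question.
df$sum_above_mean <- rowSums(vals * above, na.rm = TRUE)

print(df$sum_above_mean)
```

With these toy values the row means are 3, 4 and 4, so only the 5 in row 1 and the 8 in row 3 are counted; row 2 has no value strictly above its mean and gets 0.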

data

set.seed(24)
df1 <- as.data.frame(matrix(sample(1:8, 11 * 10, replace = TRUE), 
     ncol = 11, dimnames = list(NULL, paste0("id_group", 1:11))))
df1$mean_id <- sample(1:6, 10, replace = TRUE)
