
Creating a new variable that contains conditional rowsums in R

I have a dataframe with 12 variables:

id_group1, id_group2, ..., id_group11 : 11 variables with a numeric value

mean_id : mean over all the above mentioned id_group variables

What I would need now is a new variable that contains the rowsum only for id_group variables whose value is LARGER THAN mean_id.

I am new to R and am still struggling with seemingly simple operations - so far I have tried using ifelse constructions but it never seemed to work.

Does anyone have an idea how to go about this?

Here is one option with apply. Loop over the rows (assuming that the last, i.e. 12th, column is 'mean_id'), subset the other elements that are greater than the 12th, and get their sum:

apply(df1, 1, function(x) sum(x[-12][x[-12] > x[12]], na.rm = TRUE))
#[1] 42 40 52 39 50 51 49 49 24 27
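To make the anonymous function less opaque, here is a minimal sketch of what it does for a single row. The vector x and the names values, mean_id, keep and row_sum are made up for illustration and are not part of the data used in this answer.

```r
# One hypothetical row: 11 id_group values followed by mean_id in position 12.
x <- c(3, 8, 1, 6, 2, 7, 4, 5, 8, 1, 6, 5)

values  <- x[-12]            # drop the 12th element, keeping the 11 id_group values
mean_id <- x[12]             # the threshold for this row
keep    <- values > mean_id  # TRUE where a value is strictly greater than mean_id

# Sum only the values that passed the comparison.
row_sum <- sum(values[keep], na.rm = TRUE)
row_sum
```

apply(df1, 1, ...) simply runs this same logic once per row and collects the sums into a vector.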

Alternatively, with rowSums: in the columns other than the 12th, replace the values that are less than or equal to the mean column with NA, then take the rowSums with na.rm = TRUE:

rowSums(replace(df1[-12], df1[-12] <= df1[,12], NA), na.rm = TRUE)
#[1] 42 40 52 39 50 51 49 49 24 27
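Since the question asks for a new variable, the result can also be stored as a column, and the id_group columns can be picked by name rather than by position. The following is only a sketch under assumptions: the three-column data frame df and the column name sum_above_mean are hypothetical, and mean_id is computed here with rowMeans to keep the example self-contained.

```r
# Small hypothetical data frame with three id_group columns.
df <- data.frame(
  id_group1 = c(1, 5, 2),
  id_group2 = c(4, 5, 8),
  id_group3 = c(7, 2, 2)
)

# Select the id_group columns by name, so the position of mean_id does not matter.
id_cols <- grep("^id_group", names(df), value = TRUE)
df$mean_id <- rowMeans(df[id_cols], na.rm = TRUE)

# Logical matrix: TRUE where a value is strictly greater than its row's mean_id.
# The mean_id vector has one entry per row, so it is recycled down each column.
vals  <- as.matrix(df[id_cols])
above <- vals > df$mean_id

# Multiply by the logical matrix to zero out values at or below the mean,
# then sum each row and store the result as a new column.
df$sum_above_mean <- rowSums(vals * above, na.rm = TRUE)
df
```

Selecting by name avoids hard-coding the index 12, which would silently break if the columns were reordered.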

data

set.seed(24)
df1 <- as.data.frame(matrix(sample(1:8, 11 * 10, replace = TRUE), 
     ncol = 11, dimnames = list(NULL, paste0("id_group", 1:11))))
df1$mean_id <- sample(1:6, 10, replace = TRUE)


 