I have a dataframe with 12 variables:
id_group1, id_group2, ..., id_group11 : 11 variables with a numeric value
mean_id : mean over all the above mentioned id_group variables
What I need now is a new variable that contains the row sum of only those id_group variables whose value is LARGER THAN mean_id.
I am new to R and am still struggling with seemingly simple operations - so far I have tried using ifelse constructions but it never seemed to work.
Does anyone have an idea how to go about this?
Here is one option with apply. Loop over the rows (assuming that the 12th and last column is 'mean_id'), subset the other elements that are greater than the 12th, and get the sum:
apply(df1, 1, function(x) sum(x[-12][x[-12] > x[12]], na.rm = TRUE))
#[1] 42 40 52 39 50 51 49 49 24 27
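To make the row-wise logic easier to check by hand, here is a minimal sketch on a made-up three-row data frame. The name `toy` and its values are assumptions for illustration only; `mean_id` sits in column 4 here rather than column 12.

```r
# Hypothetical toy data: three id_group columns instead of eleven
toy <- data.frame(
  id_group1 = c(1, 5, 2),
  id_group2 = c(4, 5, 8),
  id_group3 = c(7, 2, NA)
)
# Row means of the id_group columns, ignoring NA: 4, 4, 5
toy$mean_id <- rowMeans(toy[1:3], na.rm = TRUE)

# For each row, keep the id_group values strictly greater than mean_id
# (column 4 in this toy example) and sum them; na.rm drops the NA
res <- apply(toy, 1, function(x) sum(x[-4][x[-4] > x[4]], na.rm = TRUE))
res
# row 1: 7 (only 7 > 4); row 2: 10 (5 + 5); row 3: 8 (only 8 > 5)
```

Values equal to mean_id are excluded because the comparison is strict.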
Or with rowSums: replace the elements in the columns other than the 12th whose value is less than or equal to the mean_id column with NA, and get the rowSums with na.rm = TRUE:
rowSums(replace(df1[-12], df1[-12] <= df1[,12], NA), na.rm = TRUE)
#[1] 42 40 52 39 50 51 49 49 24 27
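Since the question asks for a new variable, the same rowSums idea could be stored as a column, with the id_group columns picked by name rather than by position. This is only a sketch: the data frame `dat`, its values, and the column name `sum_above_mean` are made up for illustration.

```r
# Hypothetical toy data; the real data would have 11 id_group columns
dat <- data.frame(
  id_group1 = c(1, 5, 2),
  id_group2 = c(4, 5, 8),
  id_group3 = c(7, 2, 6)
)

# Select the id_group columns by name, so the position of mean_id does not matter
grp <- grep("^id_group", names(dat), value = TRUE)
dat$mean_id <- rowMeans(dat[grp])

# Blank out every value that is not above its row's mean_id, then sum per row
vals <- dat[grp]
vals[vals <= dat$mean_id] <- NA
dat$sum_above_mean <- rowSums(vals, na.rm = TRUE)

dat$sum_above_mean
# expected values: 7, 10, 14
```

The comparison `vals <= dat$mean_id` works because the vector has one element per row and is recycled across the columns, so each value is compared with the mean_id of its own row.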
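Data used for the examples above (here mean_id is simulated rather than computed from the id_group columns):
set.seed(24)
# 10 rows x 11 columns of random integers from 1 to 8, named id_group1 ... id_group11
df1 <- as.data.frame(matrix(sample(1:8, 11 * 10, replace = TRUE),
    ncol = 11, dimnames = list(NULL, paste0("id_group", 1:11))))
# 12th column: the value each row's id_group entries are compared against
df1$mean_id <- sample(1:6, 10, replace = TRUE)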