I am calculating split-half reliability for certain behavioral items in my dataset. As a first step, for each respondent (each row) I need the mean of the first two non-NA values and the mean of the last two non-NA values. I know there are ways to do this by column using packages such as runner, zoo, and others, but I have yet to find a solution that works within rows.
For context, I designed a survey in which item order was randomized to reduce item-level effects. Participants saw a random half of the items from a particular measurement scale at one point in the survey and the other half at a different point. Therefore, each participant has as many non-NA values as NA values at each of the two time points.
For instance, say I have 8 items in total. The data for persons 1, 2, and 3 at time point 1 read:
x1 x2 x3 x4 x5 x6 x7 x8
1 NA NA 2 NA 1 1 NA
NA 4 3 3 NA NA 4 NA
3 2 1 NA NA NA 3 NA
The resulting new variables (avg1 and avg2) should read:
x1 x2 x3 x4 x5 x6 x7 x8 avg1 avg2
1 NA NA 2 NA 1 1 NA 1.5 1
NA 4 3 3 NA NA 4 NA 3.5 3.5
3 2 1 NA NA NA 3 NA 2.5 2
Any help is appreciated, thanks!
Here is one potential solution:
m <- as.matrix(read.table(text = "x1 x2 x3 x4 x5 x6 x7 x8
1 NA NA 2 NA 1 1 NA
NA 4 3 3 NA NA 4 NA
3 2 1 NA NA NA 3 NA ",
header = TRUE))
# Only keep non-NA values (apply() returns a matrix here because every row
# has the same number of non-NA values)
m2 <- t(apply(m, 1, function(x) x[!is.na(x)]))
# Select the first two non-NA values
m3 <- m2[,1:2]
# Select the second-last and last non-NA values
m4 <- m2[,(ncol(m2)-1):(ncol(m2))]
# Bind the matrix to the mean of the first two and the mean of the last two non-NA values
cbind(m, "avg1" = rowMeans(m3), "avg2" = rowMeans(m4))
#> x1 x2 x3 x4 x5 x6 x7 x8 avg1 avg2
#> [1,] 1 NA NA 2 NA 1 1 NA 1.5 1.0
#> [2,] NA 4 3 3 NA NA 4 NA 3.5 3.5
#> [3,] 3 2 1 NA NA NA 3 NA 2.5 2.0
Created on 2022-03-11 by the reprex package (v2.0.1)
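The approach above relies on every row having the same number of non-NA values, which holds for the survey design described in the question. If that assumption might not hold (for example, because of skipped items), a row-wise helper is one alternative. The following is only a sketch in base R: `edge_means` and its argument `n` are hypothetical names introduced for illustration, and returning NA for rows with fewer than `n` non-NA values is an assumption about the desired behavior.

```r
# Hypothetical helper: mean of the first n and of the last n non-NA values
# of a vector. Returns NA for both when fewer than n non-NA values exist
# (an assumed convention, not something stated in the question).
edge_means <- function(x, n = 2) {
  v <- x[!is.na(x)]
  if (length(v) < n) {
    return(c(avg1 = NA_real_, avg2 = NA_real_))
  }
  c(avg1 = mean(head(v, n)), avg2 = mean(tail(v, n)))
}

df <- read.table(text = "x1 x2 x3 x4 x5 x6 x7 x8
1 NA NA 2 NA 1 1 NA
NA 4 3 3 NA NA 4 NA
3 2 1 NA NA NA 3 NA",
header = TRUE)

# apply() returns one column per row of the input, so transpose the result
res <- t(apply(as.matrix(df), 1, edge_means))

# Attach avg1 and avg2 to the original data
out <- cbind(df, res)
out
```

Because the helper works on one row at a time, rows with different numbers of non-NA values are handled independently and never need to be stacked into a rectangular matrix of non-NA values.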