Here is my example dataset:
set.seed(123)
myd <- data.frame (sub = paste ("S", 1:10, sep = ""), P1 = sample(c(1,-1,2,0), 10, replace = TRUE),
P2 = sample(c(1,-1,2,0), 10, replace = TRUE),
I1 = sample(c(1,-1,2,0), 10, replace = TRUE),
I2 = sample(c(1,-1,2,0), 10, replace = TRUE),
I3 = sample(c(1,-1,2,0), 10, replace = TRUE),
I4 = sample(c(1,-1,2,0), 10, replace = TRUE),
I5 = sample(c(1,-1,2,0), 10, replace = TRUE),
I6 = sample(c(1,-1,2,0), 10, replace = TRUE)
)
myd
sub P1 P2 I1 I2 I3 I4 I5 I6
1 S1 -1 0 0 0 1 1 2 0
2 S2 0 -1 2 0 -1 -1 1 2
3 S3 -1 2 2 2 -1 0 -1 2
4 S4 0 2 0 0 -1 1 -1 1
5 S5 0 1 2 1 1 2 0 -1
6 S6 1 0 2 -1 1 1 -1 1
7 S7 2 1 2 0 1 1 0 -1
8 S8 0 1 2 1 -1 0 0 2
9 S9 2 -1 -1 -1 -1 0 0 -1
10 S10 -1 0 1 1 0 -1 -1 1
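Because the default algorithm behind sample() changed in R 3.6.0, running the set.seed(123) code above on a recent R version may not reproduce the values printed here. A minimal sketch that enters the printed table by hand, so that everything below can be checked against the same data:

```r
# Data entered by hand from the printed output above (not regenerated from the seed)
myd <- data.frame(
  sub = paste0("S", 1:10),
  P1 = c(-1,  0, -1,  0,  0,  1,  2,  0,  2, -1),
  P2 = c( 0, -1,  2,  2,  1,  0,  1,  1, -1,  0),
  I1 = c( 0,  2,  2,  0,  2,  2,  2,  2, -1,  1),
  I2 = c( 0,  0,  2,  0,  1, -1,  0,  1, -1,  1),
  I3 = c( 1, -1, -1, -1,  1,  1,  1, -1, -1,  0),
  I4 = c( 1, -1,  0,  1,  2,  1,  1,  0,  0, -1),
  I5 = c( 2,  1, -1, -1,  0, -1,  0,  0,  0, -1),
  I6 = c( 0,  2,  2,  1, -1,  1, -1,  2, -1,  1)
)
```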
Translation table of incorrect values, conditioned on the values of P1 and P2 (-1 is a missing value):
Condition  P1  P2  Incorrect value
I           1   1  None
II          1   0  2
III         0   1  2
IV          2   0  2 or 0
V           0   2  2 or 0
VI          2   2  1 or 0
VII         1   2  0
VIII        2   1  0
# if either P1 or P2 is -1, all values become NA
IX         -1   0  NA
X           0  -1  NA
XI         -1  -1  NA
XII        -1   2  NA
XIII        2  -1  NA
XIV        -1   1  NA
XV          1  -1  NA
The following short code gives the translation table in data.frame format, except for conditions IV, V and VI, where I did not know how to enter the two incorrect values (only one of them is entered):
ttable <- data.frame (P1 = c(1,1,0,2,0,2,1,2,-1, 0,-1,-1,2,-1,1),
P2 = c(1,0,1,0,2,2,2,1,0,-1,-1,2,-1,1,-1),
errort = c("None", 2,2,2, 2,1,0,0,NA, NA, NA, NA, NA, NA,NA))
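One possible way to enter the conditions that have two incorrect values is a long format, with one row per (P1, P2, incorrect value) combination, so that conditions IV, V and VI simply take two rows each. The sketch below assumes that layout; the names ttable_long and is_incorrect are hypothetical and only for illustration. Note that is_incorrect returns TRUE when a value is incorrect.

```r
# One row per (P1, P2, incorrect value); condition I has no incorrect value, so no row.
ttable_long <- data.frame(
  P1     = c(1, 0, 2, 2, 0, 0, 2, 2, 1, 2),
  P2     = c(0, 1, 0, 0, 2, 2, 2, 2, 2, 1),
  errort = c(2, 2, 2, 0, 2, 0, 1, 0, 0, 0)
)

# Hypothetical scalar helper: is value i incorrect given p1 and p2?
is_incorrect <- function(p1, p2, i) {
  if (p1 < 0 || p2 < 0 || i < 0) return(NA)  # -1 means missing (conditions IX to XV)
  any(ttable_long$P1 == p1 & ttable_long$P2 == p2 & ttable_long$errort == i)
}

is_incorrect(0, 2, 0)   # condition V, value 0 is incorrect
is_incorrect(0, 2, 1)   # condition V, value 1 is correct
is_incorrect(-1, 0, 1)  # a missing P1 gives NA
```

This scalar lookup is meant to make the table explicit rather than to be fast; for many rows a vectorized approach is preferable.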
For each row S1 to S10, I would like to check the values in the P1 and P2 columns and use them to judge the values in columns I1 to I6:
sub P1 P2 I1 I2 I3 I4 I5 I6
1 S1 -1 0 0 0 1 1 2 0
In this case one of P1 and P2 is -1, so all values will be NA.
Another case:
sub P1 P2 I1 I2 I3 I4 I5 I6
S4 0 2 0 0 -1 1 -1 1
Here P1 = 0 and P2 = 2, so the values are: I1 = incorrect, I2 = incorrect, I3 = NA, I4 = correct, I5 = NA, I6 = correct.
This may be written as:
sub P1 P2 I1 I2 I3 I4 I5 I6
S4 0 2 0 0 -1 1 -1 1
FALSE, FALSE, NA, TRUE, NA, TRUE
This matches condition (V): both 0 and 2 are incorrect, while 1 is correct and -1 is missing.
Another case: here P1 = 0 and P2 = 1, which matches condition (III) in the translation table, so the incorrect value is 2.
5 S5 0 1 2 1 1 2 0 -1
FALSE, TRUE, TRUE, FALSE, TRUE, NA
I need to calculate the frequency of FALSE. I tried a lot of if-else statements, but they do not give the desired output. The code gets messy with so many of them, and I do not think this is efficient for the large dataset I will be using.
qcfun <- function (x) {
x <- x[3:length(x)]
obs1 = table(c(x, 2, 0, 1, -1))
obs = obs1-1
ov <- NULL
if (x[1] == 1 & x[2] == 0){
ov = round (as.numeric (obs[4]/sum(obs)), 2)
} else {
if (x[1] == 0 & x[2] == 1){
ov = round (as.numeric (obs[4]/sum(obs)), 2)
} else {
if (x[1] == 1 & x[2] == 2){
ov = round (as.numeric (obs[2]/sum(obs)), 2)
} else {
if (x[1] == 2 & x[2] == 1){
ov = round (as.numeric (obs[2]/sum(obs)), 2)
} else {
if (x[1] == 1 & x[2] == 1){
ov = 0
} else {
ov = NA
}
}}}}
return (ov)
}
out1 <- apply(myd, 1,qcfun )
table (out1)
tout1 <- table (out1)
Is there a quick / efficient way of doing this?
You can use this vectorized function, which will be efficient for a large number of rows:
fixI <- function(p1, p2, i){
negative <- (p1 < 0) | (p2 < 0) | (i < 0)
result <- ifelse(negative, NA, TRUE) # conditions IX to XV
p <- p1 * 10 + p2
result[!negative & p %in% c(10,1,20,2) & i==2] <- FALSE
result[!negative & p %in% c(20,2,22,12,21) & i==0] <- FALSE
result[!negative & p==22 & i==1] <- FALSE
result
}
Apply it to the I columns in myd:
mat <- sapply(myd[,paste0("I",1:6)], fixI, p1=myd$P1, p2=myd$P2)
rownames(mat) <- myd$sub
Result:
I1 I2 I3 I4 I5 I6
S1 NA NA NA NA NA NA
S2 NA NA NA NA NA NA
S3 NA NA NA NA NA NA
S4 FALSE FALSE NA TRUE NA TRUE
S5 FALSE TRUE TRUE FALSE TRUE NA
S6 FALSE NA TRUE TRUE NA TRUE
S7 TRUE FALSE TRUE TRUE FALSE NA
S8 FALSE TRUE NA TRUE TRUE FALSE
S9 NA NA NA NA NA NA
S10 NA NA NA NA NA NA
Now you can count the FALSEs like this:
By row:
apply(!mat, 1, sum, na.rm=TRUE)
S1 S2 S3 S4 S5 S6 S7 S8 S9 S10
0 0 0 2 2 1 2 2 0 0
By column:
apply(!mat, 2, sum, na.rm=TRUE)
I1 I2 I3 I4 I5 I6
4 2 0 1 1 1
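Since the question asks for the frequency of FALSE, the counts can also be turned into proportions. The sketch below is one way to do that with rowSums, assuming the denominator should be the number of non-missing entries in each row (an assumption, the question does not state it). It rebuilds three rows of the result matrix by hand so that it runs on its own; n_false, n_obs and prop_false are illustrative names.

```r
# Three rows of the result matrix above, entered by hand
mat <- rbind(
  S4 = c(FALSE, FALSE, NA, TRUE, NA, TRUE),
  S5 = c(FALSE, TRUE, TRUE, FALSE, TRUE, NA),
  S9 = c(NA, NA, NA, NA, NA, NA)
)
colnames(mat) <- paste0("I", 1:6)

n_false    <- rowSums(!mat, na.rm = TRUE)  # same counts as apply(!mat, 1, sum, na.rm=TRUE)
n_obs      <- rowSums(!is.na(mat))         # non-missing entries per row
prop_false <- round(n_false / n_obs, 2)    # NaN where a row has no non-missing entries
prop_false
```

For the column-wise version, colSums can be used in the same way.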