I want to estimate Mantel-Haenszel Differential Item Functioning (DIF) Odds Ratio and HMDDIF index. I wrote the function below. It seems to me I am making a mistake when storing the results. Would you please take a look at this and give me feedback? Here is the sample data:
# generate dataset
r <- 1000
c <- 16
test <- matrix(rbinom(r*c,1,0.5),r,c)
# create sum scores for each student using first 15 columns
test <- cbind(test, apply(test[,1:15],1,sum))
colnames(test) <- c("v1","v2","v3","v4","v5","v6","v7","v8","v9","v10","v11","v12","v13","v14","v15","group","score")
test <- as.data.frame(test)
The first 15 columns are the student True/false responses to items/questions. The group membership column is the 16th column. The student "score" variable is the sum of item scores at the last (17th) column. The formula can be found here in the picture that I got from Wikipedia ( https://en.wikipedia.org/wiki/Differential_item_functioning ).
For each of the score category, I want to estimate the last two formulas in this picture. Rows are 10 students and columns are six items/questions. Again, the 16th column is group membership (1-focal, 0-reference) Here is my function code.
library(dplyr)
# this function first starts with the first item and loop k scores from 1-15. Then move to the second item.
# data should only contain the items, grouping variable, and person score.
Mantel.Haenszel <- function (data) {
# browser() #runs with debug
for (item in 1:15) { #item loop not grouping/scoring
item.incorrect <- data[,item] == 0
item.correct <- data[,item] == 1
Results <- c()
for (k in 1:15) { # for k scores
Ak <- nrow(filter(data, score == k, group == 0, item.correct)) # freq of ref group & correct
Bk <- nrow(filter(data, score == k, group == 0, item.incorrect)) # freq of ref group & incorrect
Ck <- nrow(filter(data, score == k, group == 1, item.correct)) # freq of foc group & correct
Dk <- nrow(filter(data, score == k, group == 1, item.incorrect)) # freq of foc group & incorrect
nrk <- nrow(filter(data, score == k, group == 0)) #sample size for ref
nfk <- nrow(filter(data, score == k, group == 1)) #sample size for focal
if (Bk == 0 | Ck == 0) {
next
}
nominator <-sum((Ak*Dk)/(nrk + nfk))
denominator <-sum((Bk*Ck)/(nrk + nfk))
odds.ratio <- nominator/denominator
if (odds.ratio == 0) {
next
}
MH.D.DIF <- (-2.35)*log(odds.ratio) #index
# save the output
out <- list("Odds Ratio" = odds.ratio, "MH Diff" = MH.D.DIF)
results <- rbind(Results, out)
return(results)
} # close score loop
} # close item loop
} #close function
Here is what I get
# test funnction
Mantel.Haenszel(test)
> Mantel.Haenszel(test)
Odds Ratio MH Diff
out 0.2678571 3.095659
What I want to get is
> Mantel.Haenszel(test)
Odds Ratio MH Diff
out 0.2678571 3.095659
## ##
.. ..
(15 rows here for 15 score categories in the dataset)
Should you not expect a result for every combination of item
and k
, for a max number of output rows of 225, barring any instances with break
? If so, I think you just need to change a few minor things. First, declare Results
only once, at the beginning of your function. Then, make sure you are rbind
-ing and returning either Results
or results, but not both. Then, move your
results, but not both. Then, move your
return to your actual function level rather than the loops. In the example below I've also included the current item
and k
for demonstration:
Mantel.Haenszel <- function (data) {
# browser() #runs with debug
Results <- c()
for (item in 1:15) {
#item loop not grouping/scoring
item.incorrect <- data[, item] == 0
item.correct <- data[, item] == 1
for (k in 1:15) {
# for k scores
Ak <-
nrow(filter(data, score == k, group == 0, item.correct)) # freq of ref group & correct
Bk <-
nrow(filter(data, score == k, group == 0, item.incorrect)) # freq of ref group & incorrect
Ck <-
nrow(filter(data, score == k, group == 1, item.correct)) # freq of foc group & correct
Dk <-
nrow(filter(data, score == k, group == 1, item.incorrect)) # freq of foc group & incorrect
nrk <-
nrow(filter(data, score == k, group == 0)) #sample size for ref
nfk <-
nrow(filter(data, score == k, group == 1)) #sample size for focal
if (Bk == 0 | Ck == 0) {
next
}
nominator <- sum((Ak * Dk) / (nrk + nfk))
denominator <- sum((Bk * Ck) / (nrk + nfk))
odds.ratio <- nominator / denominator
if (odds.ratio == 0) {
next
}
MH.D.DIF <- (-2.35) * log(odds.ratio) #index
# save the output
out <-
list(
item = item,
k = k,
"Odds Ratio" = odds.ratio,
"MH Diff" = MH.D.DIF
)
Results <- rbind(Results, out)
} # close score loop
} # close item loop
return(Results)
} #close function
test.output <- Mantel.Haenszel(test)
Gives an output like:
> head(test.output, 20)
item k Odds Ratio MH Diff
out 1 3 2 -1.628896
out 1 4 4.666667 -3.620046
out 1 5 0.757085 0.6539573
out 1 6 0.5823986 1.27041
out 1 7 0.9893293 0.02521097
out 1 8 1.078934 -0.1785381
out 1 9 1.006237 -0.01461145
out 1 10 1.497976 -0.9496695
out 1 11 1.435897 -0.8502066
out 1 12 1.5 -0.952843
out 2 3 0.8333333 0.4284557
out 2 4 2.424242 -2.08097
out 2 5 1.368664 -0.7375117
out 2 6 1.222222 -0.4715761
out 2 7 0.6288871 1.089938
out 2 8 1.219512 -0.4663597
out 2 9 1 0
out 2 10 2.307692 -1.965183
out 2 11 0.6666667 0.952843
out 2 12 0.375 2.304949
Is that what you're looking for?
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.