Storing simulation results in R

Question

I want to estimate the Mantel-Haenszel Differential Item Functioning (DIF) odds ratio and the MH D-DIF index. I wrote the function below, but I think I am making a mistake when storing the results. Could you please take a look and give me feedback? Here is the sample data:

# generate dataset
r <- 1000
c <- 16
test <- matrix(rbinom(r*c,1,0.5),r,c)
# create sum scores for each student using first 15 columns
test <- cbind(test, apply(test[,1:15],1,sum))
colnames(test) <- c("v1","v2","v3","v4","v5","v6","v7","v8","v9","v10","v11","v12","v13","v14","v15","group","score")
test <- as.data.frame(test)

The first 15 columns are the students' true/false responses to the items/questions. The group membership is in the 16th column. The student "score" variable, the sum of the item scores, is in the last (17th) column. The formulas can be found in the picture that I got from Wikipedia ( https://en.wikipedia.org/wiki/Differential_item_functioning ).
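The picture is not reproduced here. As a reference sketch, and assuming it shows the standard Mantel-Haenszel definitions, the last two formulas are usually written as below. Here A_k and B_k are the reference-group correct and incorrect counts at score level k, C_k and D_k are the same counts for the focal group, and N_k is the total number of examinees at that level. The symbol names follow the variables used in the code further down.

```latex
\alpha_{MH} = \frac{\sum_{k} A_k D_k / N_k}{\sum_{k} B_k C_k / N_k},
\qquad
\text{MH D-DIF} = -2.35 \, \ln\left(\alpha_{MH}\right)
```

Note that in this form the sums run over all score levels k, giving one pooled value per item. A ratio computed inside a single score level reduces to (A_k D_k) / (B_k C_k).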

For each score category, I want to estimate the last two formulas in this picture. Rows are 10 students and columns are six items/questions. Again, the 16th column is group membership (1 = focal, 0 = reference). Here is my function code:

library(dplyr)

# This function starts with the first item and loops over scores k = 1 to 15, then moves on to the next item.
# data should contain only the items, the grouping variable, and the person score.

Mantel.Haenszel <- function (data) { 
  # browser() #runs with debug
  for (item in 1:15) { #item loop not grouping/scoring

    item.incorrect <- data[,item] == 0 
    item.correct   <- data[,item] == 1
    Results <-  c() 

    for (k in 1:15) { # for k scores

        Ak <- nrow(filter(data, score == k, group == 0, item.correct)) #  freq of ref group & correct

        Bk <- nrow(filter(data, score == k, group == 0, item.incorrect)) #  freq of ref group & incorrect

        Ck <- nrow(filter(data, score == k, group == 1, item.correct)) #  freq of foc group & correct

        Dk <- nrow(filter(data, score == k, group == 1, item.incorrect)) #  freq of foc group & incorrect

        nrk <- nrow(filter(data, score == k, group == 0)) #sample size for ref

        nfk <- nrow(filter(data, score == k, group == 1)) #sample size for focal

        if (Bk == 0 | Ck == 0) { 

          next
        }

      nominator   <-sum((Ak*Dk)/(nrk + nfk))
      denominator <-sum((Bk*Ck)/(nrk + nfk))
      odds.ratio  <- nominator/denominator

       if (odds.ratio == 0) { 

        next
      }

      MH.D.DIF <- (-2.35)*log(odds.ratio) #index

      # save the output
      out <- list("Odds Ratio" = odds.ratio, "MH Diff" = MH.D.DIF)
      results <- rbind(Results, out)
      return(results)

    } # close score loop

  } # close item loop

 } #close function

Here is what I get

# test function
Mantel.Haenszel(test)

> Mantel.Haenszel(test)
    Odds Ratio MH Diff 
out 0.2678571  3.095659

What I want to get is

> Mantel.Haenszel(test)
    Odds Ratio MH Diff 
out 0.2678571  3.095659
    ##         ##
    ..         ..
    (15 rows here for 15 score categories in the dataset)

Answer 1

Shouldn't you expect a result for every combination of item and k, for a maximum of 225 output rows, minus any iterations skipped by next? If so, I think you just need to change a few minor things. First, declare Results only once, at the beginning of your function. Then, make sure you are rbind-ing to and returning either Results or results, but not both (R is case-sensitive, so these are two different objects). Finally, move your return to the function level rather than inside the loops, because a return inside the loop exits the function after the first stored row. In the example below I've also included the current item and k for demonstration:

library(dplyr) # provides filter(), used below

Mantel.Haenszel <- function (data) {
  # browser() #runs with debug

  Results <-  c()

  for (item in 1:15) {
    #item loop not grouping/scoring

    item.incorrect <- data[, item] == 0
    item.correct   <- data[, item] == 1

    for (k in 1:15) {
      # for k scores

      Ak <-
        nrow(filter(data, score == k, group == 0, item.correct)) #  freq of ref group & correct

      Bk <-
        nrow(filter(data, score == k, group == 0, item.incorrect)) #  freq of ref group & incorrect

      Ck <-
        nrow(filter(data, score == k, group == 1, item.correct)) #  freq of foc group & correct

      Dk <-
        nrow(filter(data, score == k, group == 1, item.incorrect)) #  freq of foc group & incorrect

      nrk <-
        nrow(filter(data, score == k, group == 0)) #sample size for ref

      nfk <-
        nrow(filter(data, score == k, group == 1)) #sample size for focal

      if (Bk == 0 | Ck == 0) {
        next
      }

      nominator   <- sum((Ak * Dk) / (nrk + nfk))
      denominator <- sum((Bk * Ck) / (nrk + nfk))
      odds.ratio  <- nominator / denominator

      if (odds.ratio == 0) {
        next
      }

      MH.D.DIF <- (-2.35) * log(odds.ratio) #index

      # save the output
      out <-
        list(
          item = item,
          k = k,
          "Odds Ratio" = odds.ratio,
          "MH Diff" = MH.D.DIF
        )
      Results <- rbind(Results, out)
    } # close score loop

  } # close item loop

  return(Results)

} #close function

test.output <- Mantel.Haenszel(test)

Gives an output like:

> head(test.output, 20)
    item k  Odds Ratio MH Diff    
out 1    3  2          -1.628896  
out 1    4  4.666667   -3.620046  
out 1    5  0.757085   0.6539573  
out 1    6  0.5823986  1.27041    
out 1    7  0.9893293  0.02521097 
out 1    8  1.078934   -0.1785381 
out 1    9  1.006237   -0.01461145
out 1    10 1.497976   -0.9496695 
out 1    11 1.435897   -0.8502066 
out 1    12 1.5        -0.952843  
out 2    3  0.8333333  0.4284557  
out 2    4  2.424242   -2.08097   
out 2    5  1.368664   -0.7375117 
out 2    6  1.222222   -0.4715761 
out 2    7  0.6288871  1.089938   
out 2    8  1.219512   -0.4663597 
out 2    9  1          0          
out 2    10 2.307692   -1.965183  
out 2    11 0.6666667  0.952843   
out 2    12 0.375      2.304949

Is that what you're looking for?
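As a possible variation on the function above, here is a minimal base-R sketch that stores each (item, k) row in a list and binds them once at the end. This gives an ordinary data frame rather than a matrix of list cells. The name mh_by_score, the seed, and the column names of the returned data frame are assumptions made for illustration. The skip rule (any empty cell) and the per-score ratio follow the function above.

```r
# Sketch only: same per-score computation as above, different storage strategy.
set.seed(1)  # assumed seed, only to make the example reproducible
n_items <- 15
resp <- matrix(rbinom(1000 * n_items, 1, 0.5), ncol = n_items)
test <- data.frame(resp, group = rbinom(1000, 1, 0.5), score = rowSums(resp))
names(test)[1:n_items] <- paste0("v", 1:n_items)

mh_by_score <- function(data, items = 1:15, scores = 1:15) {
  rows <- list()  # one data-frame row per stored (item, k) combination
  for (item in items) {
    for (k in scores) {
      s  <- data[data$score == k, ]             # examinees at score level k
      Ak <- sum(s$group == 0 & s[[item]] == 1)  # reference group, correct
      Bk <- sum(s$group == 0 & s[[item]] == 0)  # reference group, incorrect
      Ck <- sum(s$group == 1 & s[[item]] == 1)  # focal group, correct
      Dk <- sum(s$group == 1 & s[[item]] == 0)  # focal group, incorrect
      # Skip levels where the ratio would be 0, infinite, or undefined
      if (Ak == 0 || Bk == 0 || Ck == 0 || Dk == 0) next
      odds.ratio <- (Ak * Dk) / (Bk * Ck)       # the common 1/N_k factor cancels
      rows[[length(rows) + 1]] <- data.frame(
        item, k, odds.ratio, MH.D.DIF = -2.35 * log(odds.ratio)
      )
    }
  }
  do.call(rbind, rows)  # bind all stored rows in one call
}

out <- mh_by_score(test)
head(out)
```

Because the result is a plain data frame, columns can be used directly, for example out$MH.D.DIF[out$item == 1].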

1 answer

solution1
1 ACCEPTED 2018-07-11 21:44:04