[英]Storing simulation results in R
I want to estimate Mantel-Haenszel Differential Item Functioning (DIF) Odds Ratio and HMDDIF index. 我想估计Mantel-Haenszel差异项功能(DIF)赔率和HMDDIF指数。 I wrote the function below. 我在下面编写了函数。 It seems to me I am making a mistake when storing the results. 在我看来,存储结果时我犯了一个错误。 Would you please take a look at this and give me feedback? 您能看看这个并给我反馈吗? Here is the sample data: 这是示例数据:
# generate dataset
r <- 1000
c <- 16
test <- matrix(rbinom(r*c,1,0.5),r,c)
# create sum scores for each student using first 15 columns
test <- cbind(test, apply(test[,1:15],1,sum))
colnames(test) <- c("v1","v2","v3","v4","v5","v6","v7","v8","v9","v10","v11","v12","v13","v14","v15","group","score")
test <- as.data.frame(test)
The first 15 columns are the student True/false responses to items/questions. 前15列是学生对项目/问题的正确/错误答案。 The group membership column is the 16th column. 组成员资格列是第16列。 The student "score" variable is the sum of item scores at the last (17th) column. 学生的“得分”变量是最后一列(第17列)的项目得分总和。 The formula can be found here in the picture that I got from Wikipedia ( https://en.wikipedia.org/wiki/Differential_item_functioning ). 在我从Wikipedia( https://en.wikipedia.org/wiki/Differential_item_functioning )获得的图片中,可以找到该公式。
For each of the score category, I want to estimate the last two formulas in this picture. 对于每个分数类别,我想估计这张图中的最后两个公式。 Rows are 10 students and columns are six items/questions. 行是10个学生,列是六个项目/问题。 Again, the 16th column is group membership (1-focal, 0-reference) Here is my function code. 同样,第16列是组成员身份(1-焦点,0引用)这是我的功能代码。
library(dplyr)
# this function first starts with the first item and loop k scores from 1-15. Then move to the second item.
# data should only contain the items, grouping variable, and person score.
Mantel.Haenszel <- function (data) {
# browser() #runs with debug
for (item in 1:15) { #item loop not grouping/scoring
item.incorrect <- data[,item] == 0
item.correct <- data[,item] == 1
Results <- c()
for (k in 1:15) { # for k scores
Ak <- nrow(filter(data, score == k, group == 0, item.correct)) # freq of ref group & correct
Bk <- nrow(filter(data, score == k, group == 0, item.incorrect)) # freq of ref group & incorrect
Ck <- nrow(filter(data, score == k, group == 1, item.correct)) # freq of foc group & correct
Dk <- nrow(filter(data, score == k, group == 1, item.incorrect)) # freq of foc group & incorrect
nrk <- nrow(filter(data, score == k, group == 0)) #sample size for ref
nfk <- nrow(filter(data, score == k, group == 1)) #sample size for focal
if (Bk == 0 | Ck == 0) {
next
}
nominator <-sum((Ak*Dk)/(nrk + nfk))
denominator <-sum((Bk*Ck)/(nrk + nfk))
odds.ratio <- nominator/denominator
if (odds.ratio == 0) {
next
}
MH.D.DIF <- (-2.35)*log(odds.ratio) #index
# save the output
out <- list("Odds Ratio" = odds.ratio, "MH Diff" = MH.D.DIF)
results <- rbind(Results, out)
return(results)
} # close score loop
} # close item loop
} #close function
Here is what I get 这就是我得到的
# test funnction
Mantel.Haenszel(test)
> Mantel.Haenszel(test)
Odds Ratio MH Diff
out 0.2678571 3.095659
What I want to get is 我想要得到的是
> Mantel.Haenszel(test)
Odds Ratio MH Diff
out 0.2678571 3.095659
## ##
.. ..
(15 rows here for 15 score categories in the dataset)
Should you not expect a result for every combination of item
and k
, for a max number of output rows of 225, barring any instances with break
? 您不应该为item
和k
每种组合期望结果,最大输出行数为225,除非有break
任何实例? If so, I think you just need to change a few minor things. 如果是这样,我认为您只需要更改一些小事情。 First, declare Results
only once, at the beginning of your function. 首先,在函数开始时只声明一次Results
。 Then, make sure you are rbind
-ing and returning either Results
or results, but not both. Then, move your
然后,确保您正在rbind
并返回Results
或results, but not both. Then, move your
返回results, but not both. Then, move your
results, but not both. Then, move your
return to your actual function level rather than the loops. results, but not both. Then, move your
返回值results, but not both. Then, move your
到实际功能级别,而不是循环。 In the example below I've also included the current item
and k
for demonstration: 在下面的示例中,我还包括了当前item
和k
进行演示:
Mantel.Haenszel <- function (data) {
# browser() #runs with debug
Results <- c()
for (item in 1:15) {
#item loop not grouping/scoring
item.incorrect <- data[, item] == 0
item.correct <- data[, item] == 1
for (k in 1:15) {
# for k scores
Ak <-
nrow(filter(data, score == k, group == 0, item.correct)) # freq of ref group & correct
Bk <-
nrow(filter(data, score == k, group == 0, item.incorrect)) # freq of ref group & incorrect
Ck <-
nrow(filter(data, score == k, group == 1, item.correct)) # freq of foc group & correct
Dk <-
nrow(filter(data, score == k, group == 1, item.incorrect)) # freq of foc group & incorrect
nrk <-
nrow(filter(data, score == k, group == 0)) #sample size for ref
nfk <-
nrow(filter(data, score == k, group == 1)) #sample size for focal
if (Bk == 0 | Ck == 0) {
next
}
nominator <- sum((Ak * Dk) / (nrk + nfk))
denominator <- sum((Bk * Ck) / (nrk + nfk))
odds.ratio <- nominator / denominator
if (odds.ratio == 0) {
next
}
MH.D.DIF <- (-2.35) * log(odds.ratio) #index
# save the output
out <-
list(
item = item,
k = k,
"Odds Ratio" = odds.ratio,
"MH Diff" = MH.D.DIF
)
Results <- rbind(Results, out)
} # close score loop
} # close item loop
return(Results)
} #close function
test.output <- Mantel.Haenszel(test)
Gives an output like: 给出类似的输出:
> head(test.output, 20)
item k Odds Ratio MH Diff
out 1 3 2 -1.628896
out 1 4 4.666667 -3.620046
out 1 5 0.757085 0.6539573
out 1 6 0.5823986 1.27041
out 1 7 0.9893293 0.02521097
out 1 8 1.078934 -0.1785381
out 1 9 1.006237 -0.01461145
out 1 10 1.497976 -0.9496695
out 1 11 1.435897 -0.8502066
out 1 12 1.5 -0.952843
out 2 3 0.8333333 0.4284557
out 2 4 2.424242 -2.08097
out 2 5 1.368664 -0.7375117
out 2 6 1.222222 -0.4715761
out 2 7 0.6288871 1.089938
out 2 8 1.219512 -0.4663597
out 2 9 1 0
out 2 10 2.307692 -1.965183
out 2 11 0.6666667 0.952843
out 2 12 0.375 2.304949
Is that what you're looking for? 那是您要找的东西吗?
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.