
Is this R code of Rao score test for the Bernoulli data model correct?

I am a complete statistics novice and new to R, hence the question. I've tried to find an implementation of the Rao score test for the particular case where the data are binary and each observation has a Bernoulli distribution. I stumbled upon anova in R but failed to understand how to use it. Therefore, I tried implementing the Rao score test for this particular case myself:

rao.score.bern <- function(data, p0) {
  # assume `data` is a list of 0s and 1s
  y <- sum(data)
  n <- length(data)
  phat <- y / n

  z <- (phat - p0) / sqrt(p0 * (1 - p0) / n)
  p.value <- 2 * (1 - pnorm(abs(z)))
}

I am pretty sure there is a bug in my code, because it produces only two distinct p-values in the following scenario:

p0 <- 1 / 4
p <- seq(from=0.01, to=0.5, by=0.01)
n <- seq(from=5, to=70, by=1)
g <- expand.grid(n, p)

data <- apply(g, 1, function(x) rbinom(x[1], 1, x[2]))
p.values <- sapply(data, function(x) rao.score.bern(x[[1]], p0))

Could someone please show me where the problem is? Could you perhaps point me to a built-in solution in R?

First test, then debug.

Test

Does rao.score.bern work at all?

rao.score.bern(c(0,0,0,1,1,1), 1/6)

This prints...nothing! The last line of the function is an assignment, and R returns the value of an assignment invisibly. Fix it by replacing the last line with

2 * (1 - pnorm(abs(z)))

This eliminates the unnecessary assignment.
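The effect of that trailing assignment can be checked with base R's withVisible. This is only a small sketch: the helper names f.assign and f.plain are made up for illustration and are not part of the original code.

```r
# Hypothetical helpers: identical computation, differing only in the last line
f.assign <- function(z) {
  p.value <- 2 * (1 - pnorm(abs(z)))  # assignment as last expression
}
f.plain <- function(z) {
  2 * (1 - pnorm(abs(z)))             # bare expression as last expression
}

# withVisible() reports the returned value and whether it would auto-print
res.assign <- withVisible(f.assign(2))
res.plain <- withVisible(f.plain(2))

print(res.assign$visible)  # FALSE: the value is returned but not printed
print(res.plain$visible)   # TRUE
print(identical(res.assign$value, res.plain$value))  # same number either way
```

So the original function did compute a p-value; it just never showed it at the console.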

rao.score.bern(c(0,0,0,1,1,1), 1/6)

[1] 0.02845974

OK, now we're getting somewhere.

Debug

Unfortunately, the code still doesn't work. Let's debug by removing the call to rao.score.bern and replacing it with something that shows us the input. Don't apply it to the large input you created! Use a small piece of it:

sapply(data[1:5], function(x) x[[1]])

[1] 0 0 0 0 0

That's not what you expected, is it? It's returning just one zero for each element of data. What about this?

sapply(data[1:5], function(x) x)

[[1]]
[1] 0 0 0 0 0
[[2]]
[1] 0 0 0 0 0 0 
...
[[5]]
[1] 0 0 0 0 0 0 0 0 0

Much better! The variable x in the call to sapply refers to the entire vector, which is what you want to pass to your routine. Whence

p.values <- sapply(data, function(x) rao.score.bern(x, p0)); hist(p.values)
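This also accounts for the "only two distinct p-values" symptom: with x[[1]] the function sees a single observation, so n = 1 and phat can only be 0 or 1. A rough sketch of that check follows; the grid is deliberately smaller than the one in the question and the seed is an arbitrary choice for reproducibility.

```r
# Score-test p-value with the visible-return fix applied
rao.score.bern <- function(data, p0) {
  n <- length(data)
  phat <- sum(data) / n
  z <- (phat - p0) / sqrt(p0 * (1 - p0) / n)
  2 * (1 - pnorm(abs(z)))
}

p0 <- 1 / 4

# The only two p-values possible when a single 0 or a single 1 is passed
p.single <- c(zero = rao.score.bern(0, p0), one = rao.score.bern(1, p0))
print(p.single)

# Small illustrative grid (assumption: much smaller than the question's grid)
set.seed(1)
g <- expand.grid(n = 5:10, p = c(0.1, 0.5))
data <- apply(g, 1, function(x) rbinom(x[1], 1, x[2]))  # list of 0/1 vectors

buggy <- sapply(data, function(x) rao.score.bern(x[[1]], p0))  # first element only
fixed <- sapply(data, function(x) rao.score.bern(x, p0))       # whole vector

print(unique(buggy))  # at most the two values in p.single
print(fixed)
```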

[Figure: histogram of p.values]
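As for a built-in: to my understanding, for a single proportion stats::prop.test with correct = FALSE computes the same score statistic (reported as X-squared, which equals z squared), so its two-sided p-value should agree with the hand-written function. A minimal sketch with made-up data:

```r
# Hand-written score test, with the visible-return fix applied
rao.score.bern <- function(data, p0) {
  n <- length(data)
  phat <- sum(data) / n
  z <- (phat - p0) / sqrt(p0 * (1 - p0) / n)
  2 * (1 - pnorm(abs(z)))
}

# Illustrative data (an assumption): 16 successes in 40 Bernoulli trials
x <- rep(c(0, 1), times = c(24, 16))
p0 <- 1 / 4

# One-sample test of H0: p = p0 without the continuity correction
builtin <- prop.test(sum(x), length(x), p = p0, correct = FALSE)

manual.p <- rao.score.bern(x, p0)
builtin.p <- unname(builtin$p.value)
print(c(manual = manual.p, builtin = builtin.p))
```

Note that prop.test applies a continuity correction by default, so leaving out correct = FALSE would give a different p-value.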
