
Confidence intervals via bootstrapping

Yesterday I began reading about using bootstrapping to determine confidence intervals (CIs) in many situations. My current situation is that I am trying to estimate three parameters in a model via maximum likelihood estimation (MLE). I have done this, and now I need to define my CIs. This can obviously be done via profile likelihood, but as far as I can tell, bootstrapping will give a wider CI. My problem is that I am unsure how to actually perform the bootstrapping. I have written my own code for the parameter estimation, so I am not using any built-in MLE routines.

Basically, the observed data I have are binary, so 1 or 0. It is from those data (put into a model with three parameters) that I have tried to estimate the parameter values.

So let's say my cohort is 500. Is the idea then that I take a sample from my cohort, maybe 100, expand it to 500 again by repeating the sample 5 times, and run the simulation once again, which in turn should result in some new parameter estimates? And then I just do this 1000-2000 times in order to get a series of parameter values, which can then be used to define the CI?

Or am I missing something here?

This question isn't related to Python. I think you need to read an intro to bootstrapping. "An Introduction to Statistical Learning" provides a good one. The idea is not to sample 100; you must sample with replacement, taking the same sample size (500). Yes, then you re-estimate your parameters many times. There are then several ways of taking all of these estimates and turning them into a confidence interval. For example, you can use them to estimate the standard error (the standard deviation of the sampling distribution), and then use +/- 2*se.
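The procedure described above can be sketched roughly as follows, using only the Python standard library. This is a minimal illustration under stated assumptions, not the asker's actual model: `fit_mle` is a hypothetical placeholder that returns the Bernoulli success proportion as a one-element tuple, and it would be swapped for the custom three-parameter MLE routine. The names `bootstrap_estimates`, `normal_ci`, and `percentile_ci` are likewise made up for this example, and the data are simulated.

```python
import random
import statistics


def fit_mle(sample):
    """Hypothetical stand-in for your own MLE code.

    Returns a tuple of parameter estimates. Here it is just the
    Bernoulli MLE (the sample proportion of ones), as a 1-tuple.
    """
    return (sum(sample) / len(sample),)


def bootstrap_estimates(data, estimator, n_boot=1000, seed=0):
    """Re-estimate the parameters on n_boot resamples of the data.

    Each resample is drawn WITH replacement and has the SAME size
    as the original data set (e.g. 500 out of 500).
    """
    rng = random.Random(seed)
    n = len(data)
    return [estimator(rng.choices(data, k=n)) for _ in range(n_boot)]


def normal_ci(point, boot_values, z=2.0):
    """Point estimate +/- z * bootstrap standard error."""
    se = statistics.stdev(boot_values)
    return point - z * se, point + z * se


def percentile_ci(boot_values, alpha=0.05):
    """Approximate percentile interval from the sorted bootstrap estimates."""
    s = sorted(boot_values)
    lo_idx = int((alpha / 2) * len(s))
    hi_idx = min(len(s) - 1, int((1 - alpha / 2) * len(s)))
    return s[lo_idx], s[hi_idx]


# Simulated binary cohort of 500 (assumption: stands in for the real data).
data_rng = random.Random(42)
data = [1 if data_rng.random() < 0.3 else 0 for _ in range(500)]

point = fit_mle(data)
boot = bootstrap_estimates(data, fit_mle, n_boot=1000, seed=1)

# One interval per parameter; with a three-parameter model this loops 3 times.
for j, theta_hat in enumerate(point):
    values = [b[j] for b in boot]
    print(j, theta_hat, normal_ci(theta_hat, values), percentile_ci(values))
```

The +/- 2*se interval assumes the sampling distribution is roughly normal; the percentile interval is one of the other common ways to turn the same bootstrap estimates into a CI.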
