Using Zelig "sim" function with Amelia dataset to obtain estimates pooled across imputed datasets in R

I am using a dataset multiply imputed with Amelia and would like Zelig to calculate predicted values from a regression model. Zelig's documentation states that "When quantities of interest are plotted, such as expected and predicted values and first differences, these are correctly pooled across those from each of the m imputed datasets". This is true, but I would also like to obtain estimates pooled across the imputed datasets as the output of the "sim" command.

Here is sample code replicating the instructions on the Zelig website and generating the same output:

library("Amelia")
library("Zelig")  # provides zelig(), setx() and sim()
data(africa)
a.out <- amelia(x = africa, m=5, cs = "country", ts = "year", logs = "gdp_pc")
z.out <- zelig(gdp_pc ~ trade + civlib, model = "ls", data = a.out)
summary(z.out)

I then use "setx" to estimate the predicted values of the DV (gdp_pc) when "trade" is set to 50 and 100.

x.out <- setx (z.out, trade = c(50,100))
x.out
range:
  (Intercept) trade civlib
1           1    50  0.289
2           1   100  0.289

Next step: Use 'sim' method

If I then use "sim" and "plot", R generates a plot with the estimates I requested:

s.out <- sim (z.out, x = x.out)
plot(s.out)

However, I would like a printout of the predicted values, their standard errors, and their confidence intervals at different levels, pooled across all the imputed datasets according to Rubin's rules. This is not what the "summary" command seems to be doing:

summary(s.out)
[1] 50


 sim range :
 -----
ev
     mean     sd      50%     2.5%   97.5%
1 844.843 30.567 845.1218 791.8107 908.658
pv
         mean       sd      50%     2.5%    97.5%
[1,] 857.6479 372.9689 852.9239 157.7842 1553.552

 sim range :
 -----
ev
      mean       sd      50%     2.5%    97.5%
1 836.2505 36.72892 833.3876 770.7931 908.7371
pv
         mean      sd      50%     2.5%    97.5%
[1,] 821.3542 359.461 790.5742 204.7687 1483.275

 sim range :
 -----
ev
     mean       sd      50%     2.5%    97.5%
1 837.307 34.99979 839.4895 765.0043 896.1513
pv
         mean       sd      50%     2.5%    97.5%
[1,] 831.6275 347.4005 844.0667 120.8968 1526.509

 sim range :
 -----
ev
      mean       sd      50%     2.5%    97.5%
1 838.1396 33.49521 837.6317 776.3413 901.4235
pv
         mean       sd      50%     2.5%    97.5%
[1,] 866.5946 364.2909 830.9851 263.8757 1594.664

 sim range :
 -----
ev
     mean       sd      50%     2.5%    97.5%
1 842.784 35.18827 843.5563 779.9052 914.5869
pv
         mean       sd      50%     2.5%    97.5%
[1,] 834.7425 350.5647 834.0003 228.0261 1527.293


[1] 100


 sim range :
 -----
ev
      mean       sd      50%    2.5%    97.5%
1 1743.969 54.06692 1742.795 1627.39 1840.744
pv
        mean       sd      50%     2.5%    97.5%
[1,] 1700.53 350.1268 1718.504 1047.998 2322.216

 sim range :
 -----
ev
      mean       sd      50%     2.5%    97.5%
1 1748.554 58.46152 1755.443 1634.345 1854.652
pv
         mean       sd      50%     2.5%    97.5%
[1,] 1734.831 340.8356 1734.907 1071.973 2347.156

 sim range :
 -----
ev
      mean       sd      50%     2.5%    97.5%
1 1741.014 63.86164 1741.492 1615.497 1863.306
pv
         mean       sd      50%   2.5%    97.5%
[1,] 1759.305 329.6513 1746.153 1172.5 2435.067

 sim range :
 -----
ev
      mean       sd      50%     2.5%    97.5%
1 1738.422 64.75221 1738.474 1615.078 1854.675
pv
         mean       sd      50%     2.5%    97.5%
[1,] 1728.152 386.8327 1761.047 849.7188 2395.825

 sim range :
 -----
ev
      mean       sd      50%     2.5%    97.5%
1 1746.575 53.02558 1744.919 1638.602 1848.114
pv
         mean       sd      50%    2.5%    97.5%
[1,] 1710.864 342.1865 1702.769 1050.85 2288.021

Here, I get the values for each imputed dataset separately, instead of values pooled across all the multiply imputed datasets. Is there a way to get Zelig to apply Rubin's rules to the multiply imputed datasets when providing summary statistics of the predicted estimates, as well as when drawing charts based on them?
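For reference, the pooling being asked about can be sketched in base R. This is only an illustration of Rubin's rules for a single scalar quantity: the inputs are the five per-imputation "ev" means and sds for trade = 50 copied from the printout above, the sds are treated as standard errors (an assumption), and `rubin_pool` is a hypothetical helper, not a Zelig or Amelia function.

```r
# Per-imputation expected values for trade = 50, copied from the
# summary(s.out) printout above.
q  <- c(844.843, 836.2505, 837.307, 838.1396, 842.784)
# Simulation sds from the same printout, treated here as standard errors
# (an assumption made only for this illustration).
se <- c(30.567, 36.72892, 34.99979, 33.49521, 35.18827)

# Hypothetical helper: Rubin's rules for one scalar estimate.
rubin_pool <- function(q, se) {
  m     <- length(q)              # number of imputed datasets
  q_bar <- mean(q)                # pooled point estimate
  w     <- mean(se^2)             # within-imputation variance
  b     <- var(q)                 # between-imputation variance
  t_var <- w + (1 + 1 / m) * b    # total variance
  c(estimate = q_bar, se = sqrt(t_var), within = w, between = b)
}

pooled <- rubin_pool(q, se)
print(round(pooled, 3))
```

The pooled standard error is never smaller than the square root of the average within-imputation variance, because the between-imputation term can only add to it.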

Note: the application I need requires negative binomial regression, not linear regression, as the model used in Zelig. I have used linear regression here only to replicate the example provided by the Zelig developers.

Many thanks for your help, and have a lovely day!

You don't need to use Rubin's rules in this case, since the uncertainty is calculated from the variance in the simulations. I'm a bit surprised that Zelig doesn't average these for you, but you can do it yourself without too much difficulty:

qi.out <- zelig_qi_to_df(s.out)

lapply(split(qi.out, qi.out["trade"]),
       function(x) c(trade = unique(x$trade),
                     mean = mean(x$expected_value),
                     sd = sd(x$expected_value),
                     median = median(x$expected_value),
                     quantile(x$expected_value, probs = c(0.5, 0.025, 0.975))))

lapply(split(qi.out, qi.out["trade"]),
       function(x) c(trade = unique(x$trade),
                     mean = mean(x$predicted_value),
                     sd = sd(x$predicted_value),
                     median = median(x$predicted_value),
                     quantile(x$predicted_value, probs = c(0.5, 0.025, 0.975))))
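To see why summarising the stacked simulations is enough, here is a minimal base R sketch that does not need Zelig. The draws are stand-ins generated with `rnorm`; the centres and the sd of 35 are made-up illustration values, not Zelig output, and the object names are hypothetical.

```r
set.seed(1)
n_sims <- 1000   # draws per imputed dataset (illustrative choice)

# Stand-in simulation draws: one column per imputed dataset.
# Centres and spread are made-up values for illustration only.
centres <- c(845, 836, 837, 838, 843)
draws <- sapply(centres, function(mu) rnorm(n_sims, mean = mu, sd = 35))

# Stack the draws from all imputed datasets into one vector.
pooled_draws <- as.vector(draws)

# Summarise the stacked draws; their spread reflects both the
# variation within each dataset and the variation between datasets.
pooled_summary <- c(mean = mean(pooled_draws),
                    sd = sd(pooled_draws),
                    quantile(pooled_draws, probs = c(0.5, 0.025, 0.975)))
print(round(pooled_summary, 2))
```

Because the draws from every imputed dataset are summarised together, differences between the datasets widen the pooled sd and quantiles directly, which is the sense in which a separate Rubin's rules step is not needed here.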
