How can I split a multiply imputed dataset created in Amelia?

I have imputed missing values using Amelia, creating 5 multiply imputed datasets. Now I would like to split this multi-dataset, e.g. one set for year >= 1990 and one set for year <= 1990. Any ideas how I can do so? Many thanks!

library(Amelia)
data(freetrade)
freetrade$year # splitting variable

#Imputation of missing data
a.out <- amelia(freetrade, m=5, ts="year", cs="country")

#split of created dataset?

Amelia returns an object that contains a list of data frames (one per imputation). You can see the structure of this object with str().

> library(Amelia)
> data(freetrade)
> 
> a.out <- amelia(freetrade, m=5, ts="year", cs="country")
-- Imputation 1 --

  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15

-- Imputation 2 --

  1  2  3  4  5  6  7  8  9 10 11 12 13

-- Imputation 3 --

  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19

-- Imputation 4 --

  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15

-- Imputation 5 --

  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20


> str(a.out)
List of 12
 $ imputations:List of 5
  ..$ imp1:'data.frame':    171 obs. of  10 variables:
  .. ..$ year    : int [1:171] 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 ...
  .. ..$ country : chr [1:171] "SriLanka" "SriLanka" "SriLanka" "SriLanka" ...
  .. ..$ tariff  : num [1:171] 30.6 22.4 41.3 26.8 31 ...
  .. ..$ polity  : num [1:171] 6 5 5 5 5 5 5 5 5 5 ...
  .. ..$ pop     : num [1:171] 14988000 15189000 15417000 15599000 15837000 ...
  .. ..$ gdp.pc  : num [1:171] 461 474 489 508 526 ...
  .. ..$ intresmi: num [1:171] 1.94 1.96 1.66 2.8 2.26 ...
  .. ..$ signed  : num [1:171] 0 0 1 0 0 0 0 1 0 0 ...
  .. ..$ fiveop  : num [1:171] 12.4 12.5 12.3 12.3 12.3 ...
  .. ..$ usheg   : num [1:171] 0.259 0.256 0.266 0.299 0.295 ...
  ..$ imp2:'data.frame':    171 obs. of  10 variables:
  .. ..$ year    : int [1:171] 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 ...
  .. ..$ country : chr [1:171] "SriLanka" "SriLanka" "SriLanka" "SriLanka" ...
  .. ..$ tariff  : num [1:171] 33.6 59.7 41.3 18.2 31 ...
  .. ..$ polity  : num [1:171] 6 5 5 5 5 5 5 5 5 5 ...
  .. ..$ pop     : num [1:171] 14988000 15189000 15417000 15599000 15837000 ...
  .. ..$ gdp.pc  : num [1:171] 461 474 489 508 526 ...
  .. ..$ intresmi: num [1:171] 1.94 1.96 1.66 2.8 2.26 ...
  .. ..$ signed  : num [1:171] 0 0 1 0 0 0 0 1 0 0 ...
  .. ..$ fiveop  : num [1:171] 12.4 12.5 12.3 12.3 12.3 ...
  .. ..$ usheg   : num [1:171] 0.259 0.256 0.266 0.299 0.295 ...
  ..$ imp3:'data.frame':    171 obs. of  10 variables:
  .. ..$ year    : int [1:171] 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 ...
  .. ..$ country : chr [1:171] "SriLanka" "SriLanka" "SriLanka" "SriLanka" ...
  .. ..$ tariff  : num [1:171] 48.5 32.9 41.3 47.2 31 ...
  .. ..$ polity  : num [1:171] 6 5 5 5 5 5 5 5 5 5 ...
  .. ..$ pop     : num [1:171] 14988000 15189000 15417000 15599000 15837000 ...
  .. ..$ gdp.pc  : num [1:171] 461 474 489 508 526 ...
  .. ..$ intresmi: num [1:171] 1.94 1.96 1.66 2.8 2.26 ...
  .. ..$ signed  : num [1:171] 0 0 1 0 0 0 0 1 0 0 ...
  .. ..$ fiveop  : num [1:171] 12.4 12.5 12.3 12.3 12.3 ...
  .. ..$ usheg   : num [1:171] 0.259 0.256 0.266 0.299 0.295 ...
  ..$ imp4:'data.frame':    171 obs. of  10 variables:
  .. ..$ year    : int [1:171] 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 ...
  .. ..$ country : chr [1:171] "SriLanka" "SriLanka" "SriLanka" "SriLanka" ...
  .. ..$ tariff  : num [1:171] 18.4 45.5 41.3 16.9 31 ...
  .. ..$ polity  : num [1:171] 6 5 5 5 5 5 5 5 5 5 ...
  .. ..$ pop     : num [1:171] 14988000 15189000 15417000 15599000 15837000 ...
  .. ..$ gdp.pc  : num [1:171] 461 474 489 508 526 ...
  .. ..$ intresmi: num [1:171] 1.94 1.96 1.66 2.8 2.26 ...
  .. ..$ signed  : num [1:171] 0 0 1 0 0 0 0 1 0 0 ...
  .. ..$ fiveop  : num [1:171] 12.4 12.5 12.3 12.3 12.3 ...
  .. ..$ usheg   : num [1:171] 0.259 0.256 0.266 0.299 0.295 ...
  ..$ imp5:'data.frame':    171 obs. of  10 variables:
  .. ..$ year    : int [1:171] 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 ...
  .. ..$ country : chr [1:171] "SriLanka" "SriLanka" "SriLanka" "SriLanka" ...
  .. ..$ tariff  : num [1:171] 15.3 44.4 41.3 40.1 31 ...
  .. ..$ polity  : num [1:171] 6 5 5 5 5 5 5 5 5 5 ...
  .. ..$ pop     : num [1:171] 14988000 15189000 15417000 15599000 15837000 ...
  .. ..$ gdp.pc  : num [1:171] 461 474 489 508 526 ...
  .. ..$ intresmi: num [1:171] 1.94 1.96 1.66 2.8 2.26 ...
  .. ..$ signed  : num [1:171] 0 0 1 0 0 0 0 1 0 0 ...
  .. ..$ fiveop  : num [1:171] 12.4 12.5 12.3 12.3 12.3 ...
  .. ..$ usheg   : num [1:171] 0.259 0.256 0.266 0.299 0.295 ...
  ..- attr(*, "class")= chr [1:2] "mi" "list"
 $ m          : num 5
 $ missMatrix : logi [1:171, 1:10] FALSE FALSE FALSE FALSE FALSE FALSE ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : chr [1:10] "year" "country" "tariff" "polity" ...
 $ overvalues : NULL
 $ theta      : num [1:9, 1:9, 1:5] -1 -0.08456 -0.03404 -0.00193 0.06483 ...
 $ mu         : num [1:8, 1:5] -0.08456 -0.03404 -0.00193 0.06483 -0.11178 ...
 $ covMatrices: num [1:8, 1:8, 1:5] 0.7881 -0.1869 -0.0531 0.2121 -0.0819 ...
 $ code       : num 1
 $ message    : chr "Normal EM convergence."
 $ iterHist   :List of 5
  ..$ : num [1:15, 1:3] 44 34 25 28 26 25 24 22 20 14 ...
  ..$ : num [1:13, 1:3] 44 27 24 22 22 21 18 17 14 11 ...
  ..$ : num [1:19, 1:3] 44 34 29 27 26 26 25 24 23 21 ...
  ..$ : num [1:15, 1:3] 44 34 27 28 23 24 23 23 19 19 ...
  ..$ : num [1:20, 1:3] 44 32 30 27 24 23 23 23 23 21 ...
 $ arguments  :List of 22
  ..$ idvars      : NULL
  ..$ logs        : NULL
  ..$ ts          : num 1
  ..$ cs          : num 2
  ..$ empri       : NULL
  ..$ tolerance   : num 1e-04
  ..$ polytime    : NULL
  ..$ splinetime  : NULL
  ..$ lags        : NULL
  ..$ leads       : NULL
  ..$ intercs     : logi FALSE
  ..$ sqrts       : NULL
  ..$ lgstc       : NULL
  ..$ noms        : NULL
  ..$ ords        : NULL
  ..$ priors      : NULL
  ..$ autopri     : num 0.05
  ..$ bounds      : NULL
  ..$ max.resample: num 100
  ..$ startvals   : num 0
  ..$ overimp     : NULL
  ..$ emburn      : num [1:2] 0 0
  ..- attr(*, "class")= chr [1:2] "ameliaArgs" "list"
 $ orig.vars  : chr [1:10] "year" "country" "tariff" "polity" ...
 - attr(*, "class")= chr "amelia"
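The full str() output is long. A shorter view is possible by limiting the nesting depth and looking only at the names and dimensions of the imputed data frames. The sketch below uses a hand-built stand-in object, toy.out (a hypothetical name, not real Amelia output), so that it runs without the package; the same calls should work on a.out.

```r
# Toy stand-in for an Amelia result (an assumption for illustration):
# a list whose "imputations" element holds one data frame per imputation.
make_imp <- function(seed) {
  set.seed(seed)
  data.frame(year = 1988:1992,
             country = "SriLanka",
             tariff = round(runif(5, 15, 45), 1))
}
toy.out <- list(
  imputations = setNames(lapply(1:5, make_imp), paste0("imp", 1:5)),
  m = 5
)

# Show only the top level of the object instead of every column
str(toy.out, max.level = 1)

# Names and dimensions (rows, columns) of the imputed data frames
imp_names <- names(toy.out$imputations)
imp_dims  <- sapply(toy.out$imputations, dim)
print(imp_names)
print(imp_dims)
```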

From here you can see that the "imputations" element of your a.out object contains your data frames, so you can reference each of your imputations from there. For example, a.out$imputations[[1]]$year will give you the years from your first imputation. If you would like to do that across each imputation, you can use an apply function or a loop. To illustrate this, consider:

> sapply(a.out$imputations,function(x) head(x$year))
     imp1 imp2 imp3 imp4 imp5
[1,] 1981 1981 1981 1981 1981
[2,] 1982 1982 1982 1982 1982
[3,] 1983 1983 1983 1983 1983
[4,] 1984 1984 1984 1984 1984
[5,] 1985 1985 1985 1985 1985
[6,] 1986 1986 1986 1986 1986
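The same pattern works for any per-imputation computation, not just head(). As a minimal sketch, here is the mean tariff per imputation computed once with sapply() and once with an explicit loop. The list imps is a toy stand-in for a.out$imputations (an assumption, so the example runs without Amelia).

```r
# Toy stand-in for a.out$imputations (an assumption, not real Amelia output)
imps <- lapply(1:5, function(i) {
  data.frame(year = 1988:1992, tariff = c(30, 25, 20, 15, 10) + i)
})
names(imps) <- paste0("imp", 1:5)

# sapply() simplifies the result to a named vector: one mean per imputation
mean_tariff <- sapply(imps, function(x) mean(x$tariff))

# The same computation written as an explicit loop
mean_tariff_loop <- numeric(length(imps))
names(mean_tariff_loop) <- names(imps)
for (nm in names(imps)) {
  mean_tariff_loop[nm] <- mean(imps[[nm]]$tariff)
}

print(mean_tariff)
```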

EDIT: I just re-read your question and saw that you're actually looking for something more specific. You can take what's above and apply it to make subsets of each data frame, doing something like lapply(a.out$imputations,function(x) x[x$year > 1990,]). Note that the strict comparisons > and < both leave out the rows for 1990 itself; use >= or <= in one of the subsets to keep them. I'm not sure how you would like to combine these imputed datasets (split by years greater than/less than 1990), but if you just want to append all rows together, rbind() will do the trick (if not, let me know how you'd like to and I can probably recommend a solution):

> df1 <- do.call(rbind,lapply(a.out$imputations,function(x) x[x$year > 1990,]))
> df2 <- do.call(rbind,lapply(a.out$imputations,function(x) x[x$year < 1990,]))
> head(df1)
        year  country  tariff polity      pop   gdp.pc intresmi   signed fiveop     usheg
imp1.11 1991 SriLanka 26.9000      5 17247000 597.6987 2.285213 1.000000   12.8 0.2589872
imp1.12 1992 SriLanka 25.0000      5 17405000 618.3329 2.877877 0.515665   13.1 0.2623017
imp1.13 1993 SriLanka 24.2000      5 17628420 652.6205 4.280361 0.000000   13.2 0.2812928
imp1.14 1994 SriLanka 26.0000      5 17865000 680.0408 4.389912 0.000000   13.2 0.2783585
imp1.15 1995 SriLanka 20.0000      5 18112000 707.6591 3.995919 0.000000   13.2 0.2627195
imp1.16 1996 SriLanka 20.5646      5 18300000 727.0039 3.676763 0.000000   13.2 0.2681700
> head(df2)
       year  country   tariff polity      pop   gdp.pc intresmi signed fiveop     usheg
imp1.1 1981 SriLanka 30.56693      6 14988000 461.0236 1.937347      0   12.4 0.2593112
imp1.2 1982 SriLanka 22.39382      5 15189000 473.7634 1.964430      0   12.5 0.2558008
imp1.3 1983 SriLanka 41.30000      5 15417000 489.2266 1.663936      1   12.3 0.2655022
imp1.4 1984 SriLanka 26.81580      5 15599000 508.1739 2.797462      0   12.3 0.2988009
imp1.5 1985 SriLanka 31.00000      5 15837000 525.5609 2.259116      0   12.3 0.2952431
imp1.6 1986 SriLanka 17.76314      5 16117000 538.9237 1.832549      0   12.5 0.2886563
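If you would rather keep one data frame per imputation in each period, you can skip the rbind() step and keep the lapply() results as lists. The sketch below does that, and also shows one way to stack a period while recording which imputation each row came from. It is a sketch under assumptions: imps is a toy stand-in for a.out$imputations, stack_imps() is a hypothetical helper, and the .imp column name is made up for this example. Here 1990 is assigned to the later period.

```r
# Toy stand-in for a.out$imputations (an assumption, not real Amelia output)
imps <- lapply(1:3, function(i) {
  data.frame(year = 1988:1992, tariff = c(30, 25, 20, 15, 10) + i)
})
names(imps) <- paste0("imp", 1:3)

# Keep one list of data frames per period; 1990 goes to the later period here
early <- lapply(imps, function(x) x[x$year <  1990, ])
late  <- lapply(imps, function(x) x[x$year >= 1990, ])

# Stack one period and record which imputation each row came from
# (stack_imps and the .imp column are hypothetical names for this sketch)
stack_imps <- function(lst) {
  tagged <- Map(function(x, nm) cbind(.imp = nm, x), lst, names(lst))
  out <- do.call(rbind, unname(tagged))
  rownames(out) <- NULL
  out
}

late_long <- stack_imps(late)
print(table(late_long$.imp))
```

Keeping early and late as lists preserves the one-data-frame-per-imputation layout, which is convenient if you later want to fit a model to each imputation separately.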
