
R — Error when selecting subset of predictors for an imputation model in mice package

I am attempting to create a multiple imputation model using the mice package in R. Here are some details about what I am specifically trying to do, and below is a subset of the data I am working with, the code I have tried, and the error that I am getting.

In the dataset, labelled 'mturk.all', there are ~300 variables and around 1300 cases. For the multiple imputation model, I am trying to use only 33 of the ~300 variables; these 33 variables are dichotomous (coded as 0 and 1 in the dataset).

Following the mice code provided at https://stefvanbuuren.name/fimd/sec-toomany.html (see section 9.1.6 on the linked website), I have tried the following code, which resulted in an error (also shown below):

>library(mice)
>pred <- quickpred(mturk.all, mincor = .1, minpuc = 0, inc=c("TSHS_1R", "TSHS_2R", "TSHS_3R", "TSHS_4R", "TSHS_5R", "TSHS_6R", "TSHS_7R", "TSHS_8R", "TSHS_9R", "TSHS_10R", "TSHS_11R", "TSHS_12R", "TSHS_13R", "TSHS_14R", "TSHS_15R", "TSHS_16R", "TSHS_17R", "TSHS_18R", "TSHS_19R", "TSHS_20R", "TSHS_21R", "TSHS_22R", "TSHS_23R", "TSHS_24R", "TSHS_25R", "TSHS_26R", "TSHS_27R", "TSHS_28R", "TSHS_29R", "TSHS_30R", "TSHS_31R", "TSHS_32R", "TSHS_33R"))
There were 14 warnings (use warnings() to see them)
> warnings()
Warning messages:
1: In data.matrix(data) : NAs introduced by coercion
2: In data.matrix(data) : NAs introduced by coercion
3: In data.matrix(data) : NAs introduced by coercion
4: In data.matrix(data) : NAs introduced by coercion
5: In data.matrix(data) : NAs introduced by coercion
6: In data.matrix(data) : NAs introduced by coercion
7: In data.matrix(data) : NAs introduced by coercion
8: In data.matrix(data) : NAs introduced by coercion
9: In data.matrix(data) : NAs introduced by coercion
10: In data.matrix(data) : NAs introduced by coercion
11: In data.matrix(data) : NAs introduced by coercion
12: In data.matrix(data) : NAs introduced by coercion
13: In data.matrix(data) : NAs introduced by coercion
14: In data.matrix(data) : NAs introduced by coercion
>mturk.all.imp <- mice(mturk.all, m = 40, method = 'logreg', pred = pred)
Error in parse(text = x, keep.source = FALSE) : 
  <text>:1:1: unexpected '<'
1: <
    ^
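
One way to narrow this down, offered as a hedged sketch rather than a confirmed diagnosis: the `data.matrix()` warnings suggest that some of the ~300 columns are not numeric (for example character columns), and the `parse()` error suggests that some string handed to the R parser starts with `<`. The data frame `toy`, its `comment` column, and the column name `<other>` below are hypothetical stand-ins for `mturk.all`; only base R is used.

```r
# Hypothetical stand-in for mturk.all: two 0/1 items, one text column,
# and one column whose name is not a syntactically valid R name.
toy <- data.frame(
  TSHS_1R = c(0, 1, NA, 0),
  TSHS_2R = c(1, 0, 0, NA),
  comment = c("a", "b", NA, "d"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)
toy[["<other>"]] <- c(1, 2, 3, 4)

# Columns that data.matrix() would have to coerce (candidates for the warnings).
non_numeric <- names(toy)[!vapply(toy, is.numeric, logical(1))]

# Names that differ from their make.names() form are not valid R names.
bad_names <- names(toy)[names(toy) != make.names(names(toy))]

# Parsing a string that starts with "<" fails with the same kind of message
# as in the question.
msg <- tryCatch(
  {
    parse(text = "<other>", keep.source = FALSE)
    NA_character_
  },
  error = function(e) conditionMessage(e)
)

print(non_numeric)
print(bad_names)
print(msg)
```

Running the same two checks on the real `mturk.all` would show whether any column outside the 33 items is non-numeric or oddly named; whether that is what mice trips over here is an assumption to verify.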

Alternatively, I tried the following:

>inlist <- mturk.all[c("TSHS_1R", "TSHS_2R", "TSHS_3R", "TSHS_4R", "TSHS_5R", "TSHS_6R", "TSHS_7R", "TSHS_8R", "TSHS_9R", "TSHS_10R", "TSHS_11R", "TSHS_12R", "TSHS_13R", "TSHS_14R", "TSHS_15R", "TSHS_16R", "TSHS_17R", "TSHS_18R", "TSHS_19R", "TSHS_20R", "TSHS_21R", "TSHS_22R", "TSHS_23R", "TSHS_24R", "TSHS_25R", "TSHS_26R", "TSHS_27R", "TSHS_28R", "TSHS_29R", "TSHS_30R", "TSHS_31R", "TSHS_32R", "TSHS_33R")]
>pred <- quickpred(mturk.all, mincor = .1, minpuc = 0, inc=inlist)
>mturk.all.imp <- mice(mturk.all, m = 40, method = 'logreg', pred = pred)
Error in parse(text = x, keep.source = FALSE) : 
  <text>:1:1: unexpected '<'
1: <
    ^

I have also replaced the 'logreg' imputation method with 'pmm' and received the same error message.
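
Two details in the calls above may be worth checking. In `quickpred()`, `inc` partially matches the `include` argument, which takes a character vector of variable names, whereas the second attempt passes a data frame. Also, `include` only forces those variables to be used as predictors; both `quickpred()` and `mice()` still receive all ~300 columns. If the goal is to work with the 33 items only, one option is to subset the data first. The sketch below assumes the mice package is installed; `toy`, `notes`, and `m = 2` are hypothetical choices made to keep the example small, and the items are converted to factors because `logreg` is intended for categorical data.

```r
# Hypothetical stand-in for mturk.all: three 0/1 items plus one unrelated text column.
set.seed(1)
n <- 200
toy <- data.frame(
  TSHS_1R = rbinom(n, 1, 0.4),
  TSHS_2R = rbinom(n, 1, 0.5),
  TSHS_3R = rbinom(n, 1, 0.3),
  notes   = sample(c("x", "y"), n, replace = TRUE),
  stringsAsFactors = FALSE
)
toy$TSHS_1R[sample(n, 15)] <- NA
toy$TSHS_2R[sample(n, 15)] <- NA

# Item names as a character vector (for the real data: paste0("TSHS_", 1:33, "R")).
items <- paste0("TSHS_", 1:3, "R")

# Keep only the items, as a plain data.frame, and store them as two-level factors.
sub <- as.data.frame(toy[, items])
sub[] <- lapply(sub, factor, levels = c(0, 1))

if (requireNamespace("mice", quietly = TRUE)) {
  # include takes variable names, not a data frame.
  pred <- mice::quickpred(sub, mincor = 0.1, minpuc = 0, include = items)
  # Use logreg only for incomplete columns; complete columns get "".
  meth <- ifelse(colSums(is.na(sub)) > 0, "logreg", "")
  imp <- mice::mice(sub, m = 2, method = meth, predictorMatrix = pred,
                    printFlag = FALSE, seed = 123)
  completed <- mice::complete(imp, 1)
}
```

For the real data, `m = 40` and the 33 item names would replace the toy values.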

Here is a subset of the dataset, mturk.all, along with the version of RStudio and the version of mice I am using.

> dput(mturk.all[425:434, 1:33])
structure(list(TSHS_1R = c(0, 1, 1, 0, 0, 0, 1, 1, 0, 1), TSHS_2R = c(0, 
0, 0, 1, 0, 0, 0, 0, 0, 0), TSHS_3R = c(0, 1, 0, 0, 0, 0, 0, 
0, 1, 0), TSHS_4R = c(0, 1, 0, 0, 0, 0, 1, 0, 1, 0), TSHS_5R = c(0, 
0, 0, 1, 0, 0, 0, 0, 1, 0), TSHS_6R = c(0, 0, NA, NA, 0, 0, 1, 
0, 0, 0), TSHS_7R = c(0, 0, 0, 1, 0, 1, 1, 0, 0, 0), TSHS_8R = c(1, 
1, 0, 0, 1, 1, 1, 1, 0, 1), TSHS_9R = c(0, 0, 0, 0, 0, 0, 0, 
1, 1, 0), TSHS_10R = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0), TSHS_11R = c(0, 
0, NA, 0, 0, 0, 0, 1, 0, 0), TSHS_12R = c(1, 0, 1, 0, 1, 0, 0, 
0, 0, 0), TSHS_13R = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 1), TSHS_14R = c(0, 
0, 0, 0, 0, 0, 0, 0, 0, 0), TSHS_15R = c(0, 0, 1, 0, 0, 0, 0, 
0, 0, 0), TSHS_16R = c(0, 0, 1, 0, 0, 0, 0, 0, 0, 0), TSHS_17R = c(0, 
0, 0, 0, 0, 0, 0, 0, 0, 0), TSHS_18R = c(0, 0, 0, 0, 1, 0, 0, 
0, 0, 0), TSHS_19R = c(0, 0, 0, NA, 0, 0, 0, 0, 0, 0), TSHS_20R = c(0, 
0, 0, 1, 0, 0, 0, 0, 0, 0), TSHS_21R = c(0, 0, 0, 0, 1, 0, 0, 
0, 0, 0), TSHS_22R = c(0, 1, 0, 0, NA, 1, 0, 0, 0, 0), TSHS_23R = c(0, 
0, 0, 0, NA, 1, 0, 0, 0, 1), TSHS_24R = c(0, 0, 0, 1, NA, 1, 
0, 1, 1, 0), TSHS_25R = c(1, 1, 1, 0, 1, 1, 1, 1, 1, 1), TSHS_26R = c(1, 
0, 1, 0, 0, 0, 0, 1, 0, 1), TSHS_27R = c(1, 0, 0, 1, 0, 1, 1, 
0, 0, 1), TSHS_28R = c(0, 0, 0, 0, 0, 0, 0, 0, 1, 0), TSHS_29R = c(1, 
0, 0, 0, 1, 0, 0, 0, 0, 0), TSHS_30R = c(0, 0, 0, 0, 0, 0, 0, 
0, 0, 0), TSHS_31R = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0), TSHS_32R = c(0, 
0, 0, 0, 0, 0, 0, 0, 0, 0), TSHS_33R = c(0, 0, 0, 0, 0, 0, 0, 
0, 0, 0)), row.names = c(NA, -10L), class = c("tbl_df", "tbl", 
"data.frame"))
> rstudioapi::versionInfo()
$`citation`

To cite RStudio in publications use:

  RStudio Team (2018). RStudio: Integrated Development for R. RStudio, Inc., Boston, MA URL http://www.rstudio.com/.

A BibTeX entry for LaTeX users is

  @Manual{,
    title = {RStudio: Integrated Development Environment for R},
    author = {{RStudio Team}},
    organization = {RStudio, Inc.},
    address = {Boston, MA},
    year = {2018},
    url = {http://www.rstudio.com/},
  }


$`mode`
[1] "desktop"

$version
[1] ‘1.2.1335’

> packageVersion("mice")
[1] ‘3.8.0’

Any help identifying what I am doing wrong in the code would be appreciated!

Ian

Here are some things you may want to try:

  1. Call traceback() right after the error to see the most recent calls for further troubleshooting.
  2. Try updating your R version and the mice package.
  3. Try with different inputs.
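
For the first suggestion, `traceback()` is meant to be called interactively immediately after the error. In a script, a similar call stack can be captured with `withCallingHandlers()` and `sys.calls()`. This is a minimal sketch; `build_formula` and `run_model` are hypothetical stand-ins for whatever internal function ends up calling `parse()`, not actual mice internals.

```r
# Hypothetical stand-ins for package internals that parse a generated string.
build_formula <- function(name) parse(text = name, keep.source = FALSE)
run_model <- function(name) build_formula(name)

stack <- NULL
msg <- tryCatch(
  withCallingHandlers(
    run_model("<bad>"),
    # Record the call stack at the moment the error is signalled.
    error = function(e) stack <<- sys.calls()
  ),
  error = function(e) conditionMessage(e)
)

# Keep only the first line of each deparsed call for a compact listing.
stack_lines <- vapply(as.list(stack), function(cl) deparse(cl)[1], character(1))

print(msg)
print(stack_lines)
```

The listing shows which function passed the offending string to `parse()`, which is the information `traceback()` would give in an interactive session.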

I think you can try as.data.frame(mturk.all), since quickpred and mice may only accept a data frame or a matrix.
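
The `dput()` output in the question shows the class vector `c("tbl_df", "tbl", "data.frame")`, so the data is a tibble. A minimal sketch of the conversion follows; the object `tb` is a hypothetical stand-in built by hand so that the tibble package is not required, and whether the conversion removes the error is untested.

```r
# Small object with the same class vector as in the dput() output
# (built by hand so the tibble package is not needed).
tb <- structure(
  list(TSHS_1R = c(0, 1, NA), TSHS_2R = c(1, 0, 0)),
  row.names = c(NA, -3L),
  class = c("tbl_df", "tbl", "data.frame")
)

# Drop the tibble classes; the columns themselves are unchanged.
df <- as.data.frame(tb)

print(class(tb))
print(class(df))
```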

Sorry for the late reply; I cannot reproduce the problem. I have a few more suggestions:

  1. Check the encoding of your data file. You should save it with UTF-8 encoding (with Notepad++, ...).
  2. Also check the data for any errors.
  3. Open an issue in the package repository; you may get an answer there.
