Convert dataframe to mids using mice in R

I have used mice to impute data, saved the data as csv, and then ran a factor analysis in SPSS and generated some factors. I now want to load the csv in R and run a linear regression on the imputed data. However, when I try to convert the dataframe to mids, I get the error message shown after the following example:

library(mice)

# assign mtcars to a new dataframe
df <- mtcars

# loop 10 times
for (x in 1:10){
  
  # create a fake imp number
  a <- rep(x, 1, nrow(df))
  
  # bind the fake imp number to the df
  df2 <- cbind(df, a)
  
  # create a 10-fold stacked copy of mtcars that also carries the fake imp number
  if (x ==1){
    new_df <- df2
  } else{
    new_df <- rbind(new_df, df2)
  }
}

# change the column name of the fake imp to ".imp"
names(new_df)[names(new_df) == 'a'] <- '.imp'

# convert df to mids
df_imp <- as.mids(new_df, .imp = .imp)

> Error in as.mids(df) : Original data not found. Use `complete(...,
> action = 'long', include = TRUE)` to save original data.

Can you please help me with this error?

Let me know if this helps; otherwise I will delete my answer.

library(mice)
# EDIT ADDED THIS LINE
set.seed(42)
# assign mtcars to a new dataframe
df <- mtcars

# loop 10 times
for (x in 1:10){
  # create a fake imp number
  a <- rep(x, 1, nrow(df))
  # bind the fake imp number to the df
  df2 <- cbind(df, a)
  # create a 10-fold stacked copy of mtcars that also carries the fake imp number
  if (x ==1){
    new_df <- df2
  } else{
    new_df <- rbind(new_df, df2)
  }
}

names(new_df)[names(new_df) == 'a'] <- '.imp'
new_df <- mice(new_df)
new_df <- complete(new_df, action="long",include=TRUE)

df_imp <- as.mids(new_df)

From the as.mids() documentation:

This function converts imputed data stored in long format into an object of class mids. The original incomplete dataset needs to be available so that we know where the missing data are. The function is useful to convert back operations applied to the imputed data back in a mids object. It may also be used to store multiply imputed data sets from other software into the format used by mice.

The incomplete data is stored as imputation 0 in the long format, and as.mids() stops with "Original data not found" when no rows have that value. Starting your loop at 0 instead of 1 therefore resolves the issue. (Also, the column name passed to the .imp argument must be quoted, i.e. .imp = '.imp' in the as.mids() call. Alternatively, drop the argument and rely on the default, or skip the renaming and supply "a" as the imputation variable.)
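A quick way to confirm this diagnosis before calling as.mids() is to check whether the long data contains an imputation 0 at all. The following is a minimal base R sketch; the name `stacked` and the use of lapply() to build the stacked data are illustrative choices, not part of the original question.

```r
# Rebuild the question's stacked data: 10 copies of mtcars, numbered 1..10
# (`stacked` is a hypothetical name used only for this illustration)
stacked <- do.call(
  rbind,
  lapply(1:10, function(i) cbind(mtcars, .imp = i))
)

# as.mids() expects the original (incomplete) data as imputation 0
has_original <- 0 %in% stacked$.imp
has_original

# Count rows per imputation number to see which values are present
table(stacked$.imp)
```

If `has_original` is FALSE, the long data was built or exported without the original rows, which is the situation the error message describes.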

library(mice)
df <- mtcars
for (x in 0:10){
  a <- rep(x, 1, nrow(df))
  df2 <- cbind(df, a)
  if (x == 0){
    new_df <- df2
  } else{
    new_df <- rbind(new_df, df2)
  }
}
names(new_df)[names(new_df) == 'a'] <- '.imp'
df_imp <- as.mids(new_df)
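For the workflow described in the question (impute, export to csv, process elsewhere, then reload), the same idea can be sketched end to end. This is a sketch under assumptions rather than the asker's actual pipeline: it uses the nhanes data that ships with mice, a temporary file in place of the real csv, and the illustrative names `long`, `long_back` and `imp_back`. The external SPSS step is not reproduced.

```r
library(mice)

# nhanes ships with mice and contains real missing values
imp <- mice(nhanes, m = 5, seed = 42, printFlag = FALSE)

# Long format; include = TRUE keeps the original incomplete data as .imp == 0
long <- complete(imp, action = "long", include = TRUE)

# Round trip through a csv file (a temporary path stands in for the real one)
csv_path <- tempfile(fileext = ".csv")
write.csv(long, csv_path, row.names = FALSE)
long_back <- read.csv(csv_path, check.names = FALSE)

# Convert back to a mids object; .imp and .id are found by their default names
imp_back <- as.mids(long_back)

# Fit the regression in each imputed data set and pool the estimates
fit <- with(imp_back, lm(bmi ~ age + chl))
pooled <- pool(fit)
summary(pooled)
```

The key point is that the csv has to be written from `complete(..., action = "long", include = TRUE)`, so that the rows with `.imp == 0` survive the export and are available when the file is read back.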
