
How can I identify missing sd values in a data frame and automatically pull the corresponding rows into a new data frame, as a function?

Here is what I was trying to do:

Set up a dataframe:

# Simulate means (m), sample sizes (n) and standard deviations (sd)
df=data.frame(m=runif(500,0,100),n=round(runif(500,1,100)),sd=runif(500,1,25))
head(df)
# Replace roughly 15% of the sd values with NA (an NA index returns NA)
df$sd=df$sd[sample(c(TRUE,NA),prob=c(0.85,0.15),size=nrow(df),replace=TRUE)]

Find the rows where the sd is missing:

NaS=which(is.na(df),arr.ind=TRUE)[,1]
NaM=noquote(paste0(NaS,sep=","))

Get the mean values from the df where the sd is missing. This is the clunky bit, as I need to manually copy and paste the values of NaM here:

xm=df[c(...),1]
xm

Get the n values from the df where the sd is missing:

xn=df[c(...),2]
xn

Make this a dataframe:

Simdf=data.frame(xm,xn)

Hopefully I am understanding you correctly, but it seems you just want the m and n columns for the rows where is.na(df$sd) is TRUE. I would just use subset for that:

df=data.frame(m=runif(500,0,100),n=round(runif(500,1,100)),sd=runif(500,1,25))
head(df)
# Replace roughly 15% of the sd values with NA (an NA index returns NA)
df$sd=df$sd[sample(c(TRUE,NA),prob=c(0.85,0.15),size=nrow(df),replace=TRUE)]


df_NA <- subset(df, is.na(sd))

R> head(df_NA)
         m  n sd
8   0.8887 85 NA
20 86.1660 71 NA
26 46.9202 83 NA
48 84.4475 41 NA
51  4.8426  3 NA
53 61.7181 92 NA
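
Since the question asks for this "as a function", the same idea could be wrapped up roughly as below. This is only a sketch: the helper name extract_missing_sd and its arguments sd_col and keep are made up for illustration, the fixed seed and the exact count of 75 NAs are assumptions chosen to make the example reproducible, and it assumes sd is a plain numeric column rather than a nested data frame.

```r
# Hypothetical helper (not from the original post): return the m and n
# values for every row whose sd is missing, named xm and xn as in Simdf.
extract_missing_sd <- function(data, sd_col = "sd", keep = c("m", "n")) {
  missing <- is.na(data[[sd_col]])          # TRUE where sd is NA
  out <- data[missing, keep, drop = FALSE]  # keep only the requested columns
  names(out) <- paste0("x", keep)           # rename to xm, xn
  out
}

set.seed(1)  # assumption: fixed seed so the sketch is reproducible
df <- data.frame(m = runif(500, 0, 100),
                 n = round(runif(500, 1, 100)),
                 sd = runif(500, 1, 25))
df$sd[sample(nrow(df), 75)] <- NA  # assumption: blank out exactly 75 sd values

Simdf <- extract_missing_sd(df)
head(Simdf)
```

The row names of Simdf are the original row numbers, so no manual copying of indices is needed. If only the indices are wanted, which(is.na(df$sd)) returns them directly and can be used as df[which(is.na(df$sd)), ].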

