Is there a way to ignore NA values in a sample function in R?

I would like to randomly select two non-repeating values from each row of my data frame and insert them into two new columns at the end of the same row. I'm using sample(), but the problem is that some of the data is missing. I would like sample() to ignore the missing values. I tried specifying na.rm, but it does not work. What can I do?

Take a vector x like this:

x <- c(NA, 3, 4, 5, NA)

sample() has no na.rm argument (its signature is sample(x, size, replace = FALSE, prob = NULL)), so subset x to its non-NA values and sample from that subset:

sample(x[!is.na(x)], 1)
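One caveat from R's documentation for sample(): when x is a single number >= 1, sample(x) draws from 1:x instead of returning x. If the NA filter can leave only one value, sampling by index avoids that. The sketch below assumes a hypothetical helper name, safe_sample, and follows the spirit of the resample() example in ?sample; it also shows drawing two values, as the question asks.

```r
# Hypothetical helper: sample from the non-NA values of x by index,
# so a single remaining value is returned as-is instead of being
# treated as the upper bound of 1:x.
safe_sample <- function(x, size = 1) {
  vals <- x[!is.na(x)]
  vals[sample.int(length(vals), size)]
}

x <- c(NA, 3, 4, 5, NA)
safe_sample(x, 2)        # two distinct values from 3, 4, 5

y <- c(NA, 7, NA)
safe_sample(y, 1)        # always 7
sample(y[!is.na(y)], 1)  # may return any integer from 1 to 7
```

Like sample() itself, this helper raises an error if size is larger than the number of non-NA values, so rows with too few values still need separate handling.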

Suppose we have the following data.frame:

set.seed(3)
data <- as.data.frame(matrix(sample(c(1:30,rep(NA,20)),replace = TRUE,size = 24),ncol = 3))
data
  V1 V2 V3
1  5 20 29
2 12 10 NA
3 NA NA NA
4 NA NA  5
5 NA NA NA
6 NA  8 NA
7 NA NA  9
8  8  2  9

Some rows have enough non-NA values to sample two from, but others do not. To handle these edge cases, we can write a custom function:

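sample.function <- function(x) {
  vals <- x[!is.na(x)]                     # keep only the non-NA values
  if (length(vals) == 0) c(NA, NA)         # nothing to sample
  else if (length(vals) == 1) c(vals, NA)  # one value: return it, padded with NA
  else sample(vals, size = 2)              # two or more: draw 2 without replacement
}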

If there are no non-NA values, the function returns c(NA, NA). If there is only one non-NA value, it returns that value followed by NA. If there are two or more, it calls sample() on x subset to its non-NA values.
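The same three cases can be folded into one rule: draw as many values as are available, up to n, and pad the rest with NA. The following is only a sketch of that idea; the name sample_pad and the parameter n are hypothetical and not part of the answer above.

```r
# Hypothetical generalization of sample.function: draw up to n non-NA
# values without replacement and pad the result with NA to length n.
sample_pad <- function(x, n = 2) {
  vals <- x[!is.na(x)]
  k <- min(length(vals), n)                 # how many values can be drawn
  idx <- if (k > 0) sample.int(length(vals), k) else integer(0)
  c(vals[idx], rep(NA, n - k))              # pad the shortfall with NA
}

sample_pad(c(NA, NA, NA))   # NA NA
sample_pad(c(NA, 8, NA))    # 8 NA
sample_pad(c(5, 20, 29))    # two distinct values from 5, 20, 29
```

Sampling indices with sample.int() rather than the values themselves keeps the behaviour the same however many values are left after the NA filter.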

Then we can use apply() to run our custom sample.function on each row of data (MARGIN = 1). apply() returns each row's result as a column, so we transpose the result with t().

t(apply(data,1,sample.function))
     [,1] [,2]
[1,]   20   29
[2,]   10   12
[3,]   NA   NA
[4,]    5   NA
[5,]   NA   NA
[6,]    8   NA
[7,]    9   NA
[8,]    2    9
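To see why t() is needed, it can help to inspect the shape of what apply() returns. The sketch below uses a small hand-typed stand-in for the data frame (an assumption, so the example does not depend on the random seed) together with the sample.function from the answer.

```r
# Small stand-in for the data frame in the answer, typed by hand.
data <- data.frame(V1 = c(5, 12, NA), V2 = c(20, 10, NA), V3 = c(29, NA, 5))

sample.function <- function(x){
  if(sum(!is.na(x)) == 0) {c(NA,NA)}
  else if(sum(!is.na(x)) == 1) {c(x[!is.na(x)],NA)}
  else {sample(x[!is.na(x)],size = 2)}}

res <- apply(data, 1, sample.function)
dim(res)      # 2 3: one column per row of data
dim(t(res))   # 3 2: one row per row of data, ready for cbind()
```

apply() can build this matrix only because sample.function always returns a vector of length 2; if the lengths varied from row to row, it would return a list instead.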

Now add it to the original data. Because this calls sample() again, the sampled values differ from the output above:

setNames(cbind(data,t(apply(data,1,sample.function))),c("V1","V2","V3","Sample1","Sample2"))
  V1 V2 V3 Sample1 Sample2
1  5 20 29       5      29
2 12 10 NA      12      10
3 NA NA NA      NA      NA
4 NA NA  5       5      NA
5 NA NA NA      NA      NA
6 NA  8 NA       8      NA
7 NA NA  9       9      NA
8  8  2  9       9       8
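If the displayed samples and the stored samples should match, one option is to draw once, keep the result, and bind that. This is a sketch under assumptions: the object names picked and result are hypothetical, the data frame is a hand-typed stand-in, and the seed value is arbitrary.

```r
# Hand-typed stand-in for the data frame in the answer.
data <- data.frame(V1 = c(5, 12, NA, NA),
                   V2 = c(20, 10, NA, 8),
                   V3 = c(29, NA, NA, NA))

sample.function <- function(x){
  if(sum(!is.na(x)) == 0) {c(NA,NA)}
  else if(sum(!is.na(x)) == 1) {c(x[!is.na(x)],NA)}
  else {sample(x[!is.na(x)],size = 2)}}

set.seed(3)                                   # any fixed seed makes the draw repeatable
picked <- t(apply(data, 1, sample.function))  # draw once and keep the result
colnames(picked) <- c("Sample1", "Sample2")   # name the new columns up front
result <- cbind(data, picked)

names(result)   # "V1" "V2" "V3" "Sample1" "Sample2"
```

Naming the matrix columns before cbind() gives the same column names as the setNames() call, without repeating the existing ones.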

