
Subsetting a matrix that includes NAs

I have a matrix like so:

     a    b    c    d
[1]  as   ac   ad   ae
[2]  bd   bf   bg   bh
[3]  NA   cf   cd   ce
[4]  NA   NA   dr   dy
[5]  NA   NA   NA   ej 

I would like to subset each column separately, keeping 50% of its non-missing observations, and collect the results in a matrix or list. The output should look like this:

     a    b    c    d
[1]  as   ac   ad   ae
[2]  NA   bf   bg   bh
[3]  NA   NA   NA   ce
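
One reading of the desired output above is "keep the first half, rounded up, of the non-NA values in each column, then pad with NA". The following is a minimal base R sketch under that assumption; the helper name first_half and the result name mv_half are hypothetical and only used for illustration.

```r
# Rebuild the example matrix from the question (filled column by column)
mv <- matrix(
  c("as", "bd", NA,   NA,   NA,
    "ac", "bf", "cf", NA,   NA,
    "ad", "bg", "cd", "dr", NA,
    "ae", "bh", "ce", "dy", "ej"),
  nrow = 5,
  dimnames = list(NULL, c("a", "b", "c", "d"))
)

# Hypothetical helper: keep the first half (rounded up) of the non-NA values
first_half <- function(x) {
  x <- x[!is.na(x)]
  head(x, ceiling(length(x) / 2))
}

# One vector per column, collected in a named list
halves <- sapply(colnames(mv), function(nm) first_half(mv[, nm]),
                 simplify = FALSE)

# Pad every vector with NA to a common length to get a matrix back
n_max <- max(lengths(halves))
mv_half <- sapply(halves, function(x) { length(x) <- n_max; x })
```

The list halves keeps the ragged lengths, while mv_half has the rectangular shape shown above.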

So far I have used this code for separate columns without NAs:

mv.s <- subset(mv, mv <= quantile(mv, 0.5))    

Now I was thinking of using something like:

for (i in 1:15) {
mv.s[[i]] <- subset(mv[[i]], mv <= quantile(mv, 0.5))
}

However, when I do this I get the error:

Error in quantile.default(mv, 0.5) : missing values and NaN's not allowed if 'na.rm' is FALSE

When I try this code:

for (i in 1:15) {
mv.s[[i]] <- subset(mv[[i]], mv <= quantile(mv[[i]], 0.5))
}

I get

Error in (1 - h) * qs[i] : non-numeric argument to binary operator

Any help would be appreciated.
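
For context on the two errors: the first is raised because quantile() refuses NA values unless they are removed, and the second is consistent with quantile() being given character data, since it needs to do arithmetic on the values. The following sketch assumes a numeric matrix (the name num and its values are made up for illustration) and applies the quantile filter per column after dropping NAs.

```r
# Hypothetical numeric matrix with NAs, for illustration only
num <- matrix(
  c(1, 2, NA, NA, NA,
    3, 4, 5,  NA, NA),
  nrow = 5,
  dimnames = list(NULL, c("a", "b"))
)

# For each column: drop NAs, then keep values at or below the column median
below_median <- lapply(seq_len(ncol(num)), function(j) {
  x <- num[, j]
  x <- x[!is.na(x)]
  x[x <= quantile(x, 0.5)]
})
names(below_median) <- colnames(num)
```

Note that the quantile is computed on the same column that is being filtered, rather than on the whole object as in the loops above.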

Without any package, using only the apply function, you could do the following.

apply(mat, 2, FUN = function(x){ sample(x, ceiling(length(x)/2), replace = FALSE)})

This draws a random sample of half the observations (rounded up) from each column, without replacement, and assumes that your matrix is called mat. Note that NA entries are sampled like any other value.

If you call set.seed(1) first to make the random sample reproducible, the result will look like this.

     [,1] [,2] [,3] [,4]
[1,] "bd" NA   NA   "ae"
[2,] NA   "ac" "cd" "ej"
[3,] NA   "cf" "bg" "dy"
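
As a rough illustration of the points above, the sketch below shows that resetting the seed reproduces the same sample, and adds a variant that drops NA values before sampling. The matrix is rebuilt from the question; the names s1, s2, sample_non_na and halves are hypothetical.

```r
# Example matrix from the question (filled column by column)
mat <- matrix(
  c("as", "bd", NA,   NA,   NA,
    "ac", "bf", "cf", NA,   NA,
    "ad", "bg", "cd", "dr", NA,
    "ae", "bh", "ce", "dy", "ej"),
  nrow = 5
)

# Same seed, same call: the two samples are identical
set.seed(1)
s1 <- apply(mat, 2, function(x) sample(x, ceiling(length(x) / 2)))
set.seed(1)
s2 <- apply(mat, 2, function(x) sample(x, ceiling(length(x) / 2)))

# Variant: sample half (rounded up) of the non-NA values only.
# Indexing via sample.int() avoids surprises with length-one vectors.
sample_non_na <- function(x) {
  x <- x[!is.na(x)]
  x[sample.int(length(x), ceiling(length(x) / 2))]
}
halves <- lapply(seq_len(ncol(mat)), function(j) sample_non_na(mat[, j]))
```

Because the columns have different numbers of non-NA values, the variant returns a list of vectors with different lengths rather than a matrix.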

The sample_frac() function in dplyr sounds like it fits your needs.

install.packages('dplyr')
library(dplyr)

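# sample_frac() expects a data frame, so each column is wrapped in one first
subset_matrix <- apply(mv, 2, function(x) sample_frac(data.frame(value = x), 0.5, replace = FALSE)$value)

You can specify which fraction of rows you want sampled in sample_frac(). Because sample_frac() works on data frames rather than bare vectors, each column is wrapped in a one-column data frame before sampling. Using apply() column-wise then gives you that fraction of observations for each column.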

I did not test this because you didn't provide a sample of your data, but it looks like it should work.
