
Removing a column of data from a matrix based on standard deviation of the row in R

I am trying to subset a large data matrix, an example of which is below:

                row 1/col 1         row 1/col 2         row 1/col 3
 [1,]             855.815             749.574             754.950
 [2,]             855.718             749.496             755.004
 [3,]             855.846             749.359             754.910
 [4,]             855.746             749.299             754.795
 [5,]             855.805             749.421             754.883

I am trying to remove every column whose first-row value lies more than one standard deviation above or below the mean of the first row, using this code:

library(matrixStats)
x = data[,-1] > (rowMeans(data[,-1]) + rowSds(data[,-1]))
y = data[,-1] < (rowMeans(data[,-1]) - rowSds(data[,-1]))
subset(df2, !(x | y))

But this returns the following error when applied to my dataset:

Error in x[subset & !is.na(subset), vars, drop = drop] : 
  (subscript) logical subscript too long

As I understand it, R has expanded this to read:

subset(df2, !(data[,-1] > (rowMeans(data[,-1]) + rowSds(data[,-1]))|data[,-1] < (rowMeans(data[,-1]) - rowSds(data[,-1]))))

and that the logical argument is simply too long. Is there something I am missing? I am inexperienced with R and am sure there are neater ways to do this, but from what I have read I thought subset would be most useful.

Thank you in advance.

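You can try this. It compares each value in the first row with the mean of that row and keeps only the columns that lie within one standard deviation: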

df <- as.matrix(read.table(text='C1 C2 C3
                 [1,]             855.815             749.574             754.950
                 [2,]             855.718             749.496             755.004
                 [3,]             855.846             749.359             754.910
                 [4,]             855.746             749.299             754.795
                 [5,]             855.805             749.421             754.883', header=TRUE))

library(matrixStats)
df[,which(abs(df[1,] - rowMeans(df)[1]) < rowSds(df)[1])]

#      C2      C3
#[1,] 749.574 754.950
#[2,] 749.496 755.004
#[3,] 749.359 754.910
#[4,] 749.299 754.795
#[5,] 749.421 754.883
