
How to subtract a vector from a matrix by row rather than by column

I want to subtract from each column its own mean. First I compute the mean of each column. My data, a matrix called m, is shown below:

         angel distance
     [1,]   1.3     0.43
     [2,]   4.0     0.84
     [3,]   2.7     0.58
     [4,]   2.2     0.58
     [5,]   3.6     0.70
     [6,]   4.9     1.00
     [7,]   0.9     0.27
     [8,]   1.1     0.29
     [9,]   3.1     0.63

> mean<-apply(m,2,FUN=mean)

        angel  distance 
    2.6444444 0.5911111 

> m-mean
        angel    distance
1 -1.34444444 -0.16111111
2  3.40888889 -1.80444444
3  0.05555556 -0.01111111
4  1.60888889 -2.06444444
5  0.95555556  0.10888889
6  4.30888889 -1.64444444
7 -1.74444444 -0.32111111
8  0.50888889 -2.35444444
9  0.45555556  0.03888889

So the result comes from subtracting the mean vector down each column. I want it subtracted along each row instead, so that every column has its own mean removed. How can I do this?

First, use colMeans(m) to get the column means of matrix m. Then use sweep:

sweep(m, 2, colMeans(m))

where 2 specifies the margin (we want a column-wise operation, and in a 2D index the second index refers to columns). By default, sweep uses FUN = "-", so the call above subtracts the column means from the matrix, i.e. it centres the matrix.
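To make the difference concrete, here is a minimal sketch that rebuilds the matrix from the question (values copied from the post) and compares plain subtraction with sweep. The names wrong, centred and centred_t are only illustrative. The transpose idiom is shown as one possible alternative, not as the recommended way.

```r
# Rebuild the matrix from the question (values copied from the post).
m <- cbind(
  angel    = c(1.3, 4.0, 2.7, 2.2, 3.6, 4.9, 0.9, 1.1, 3.1),
  distance = c(0.43, 0.84, 0.58, 0.58, 0.70, 1.00, 0.27, 0.29, 0.63)
)

# Plain subtraction recycles the length-2 vector down the columns of m,
# so the two means alternate along the rows instead of matching the columns.
wrong <- m - colMeans(m)

# sweep() aligns the vector with margin 2 (columns) before subtracting.
centred <- sweep(m, 2, colMeans(m))

# An equivalent base R idiom: transpose, subtract, transpose back.
centred_t <- t(t(m) - colMeans(m))

colMeans(centred)               # both values are numerically zero
all.equal(centred, centred_t)   # the two approaches agree
```

Here wrong reproduces the unwanted output from the question, while centred has column means of zero up to floating-point error.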

Similarly, if we want to subtract each row's mean from that row, we can use:

sweep(m, 1, rowMeans(m))
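A short sketch of the row-wise case, again using the question's data. It also illustrates that for rows, plain subtraction happens to give the same values, because a vector of length nrow(m) recycled down the columns lines up with the rows. The names by_row and by_row_plain are only illustrative.

```r
# Rebuild the matrix from the question (values copied from the post).
m <- cbind(
  angel    = c(1.3, 4.0, 2.7, 2.2, 3.6, 4.9, 0.9, 1.1, 3.1),
  distance = c(0.43, 0.84, 0.58, 0.58, 0.70, 1.00, 0.27, 0.29, 0.63)
)

# Subtract each row's mean from that row (margin 1 = rows).
by_row <- sweep(m, 1, rowMeans(m))

# Plain subtraction: element i of the vector meets row i in every column,
# because R stores matrices column by column and recycles the vector.
by_row_plain <- m - rowMeans(m)

rowSums(by_row)                    # every row sums to (numerically) zero
all.equal(by_row, by_row_plain)    # same result either way
```

Using sweep for both margins keeps the intent explicit, which may be easier to read than relying on the recycling order.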

You can set the FUN argument to other functions, too. Another common use of sweep is column or row rescaling; see "How to rescale my matrix by column or row" for more.
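As an illustration of a different FUN, the sketch below divides each column by its maximum. The choice of the maximum as the scaling factor is an assumption made only for this example.

```r
# Rebuild the matrix from the question (values copied from the post).
m <- cbind(
  angel    = c(1.3, 4.0, 2.7, 2.2, 3.6, 4.9, 0.9, 1.1, 3.1),
  distance = c(0.43, 0.84, 0.58, 0.58, 0.70, 1.00, 0.27, 0.29, 0.63)
)

# Per-column maxima, used here as (illustrative) scaling factors.
col_max <- apply(m, 2, max)

# Divide every column by its own maximum.
rescaled <- sweep(m, 2, col_max, FUN = "/")

apply(rescaled, 2, max)   # each column now peaks at 1
```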


The function scale, mentioned in the other answer, only performs column-wise operations. A common use is to standardise all columns of a matrix. We can set scale = FALSE to perform column centring only.

scale is essentially a wrapper around sweep, which you can verify by inspecting the source code of scale.default (excerpt):

if (center) {
    center <- colMeans(x, na.rm = TRUE)
    x <- sweep(x, 2L, center, check.margin = FALSE)
}

if (scale) {
    scale <- apply(x, 2L, f)
    x <- sweep(x, 2L, scale, "/", check.margin = FALSE)
}

Read ?sweep , ?scale , ?colMeans for more on those functions.
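The relationship between scale and sweep can be checked numerically. This sketch assumes the question's data and compares values only, since scale attaches extra attributes such as "scaled:center" to its result. The names a, b, z and z2 are only illustrative.

```r
# Rebuild the matrix from the question (values copied from the post).
m <- cbind(
  angel    = c(1.3, 4.0, 2.7, 2.2, 3.6, 4.9, 0.9, 1.1, 3.1),
  distance = c(0.43, 0.84, 0.58, 0.58, 0.70, 1.00, 0.27, 0.29, 0.63)
)

# Centring only: scale() versus an explicit sweep().
a <- scale(m, center = TRUE, scale = FALSE)
b <- sweep(m, 2, colMeans(m))
all.equal(a, b, check.attributes = FALSE)

# scale() records the centres it used as an attribute.
attr(a, "scaled:center")

# Full standardisation: centre, then divide by the column standard deviations.
z  <- scale(m)
z2 <- sweep(b, 2, apply(m, 2, sd), FUN = "/")
all.equal(z, z2, check.attributes = FALSE)
```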

You can get the same result with scale (z-score normalization without the scaling step, i.e. centring only):

scale(m, scale = FALSE)

           angel    distance
[1,] -1.34444444 -0.16111111
[2,]  1.35555556  0.24888889
[3,]  0.05555556 -0.01111111
[4,] -0.44444444 -0.01111111
[5,]  0.95555556  0.10888889
[6,]  2.25555556  0.40888889
[7,] -1.74444444 -0.32111111
[8,] -1.54444444 -0.30111111
[9,]  0.45555556  0.03888889
