
R matrix apply function with

I have the following example data frame x, which I use as a matrix mat.

x <- data.frame(c1=c(1,2,3,2,1,3),
                c2=c(4,5,6,2,3,4),
                c3=c(7,8,9,7,1,6),
                c4=c(4,0,9,1,5,0),
                c5=c(3,8,0,7,3,6),
                c6=c(2,8,5,0,5,7),
                row.names = c("r1","r2","r3","r4","r5","r6"))
mat <- as.matrix(x)  # the code below refers to the data as mat

I need to apply a function f to each column, where cMin and cMax are the vectors of column minimums and column maximums.

# colMaxs()/colMins() are not base R; the matrixStats package is one provider (assumed here)
cMax <- colMaxs(mat)
cMin <- colMins(mat)

I am trying to use apply(mat, 2, f) with the function shown below, but I get warnings and the result is also incorrect.

f <- function(x) (x - cMin[])/(cMax - cMin)

Warning messages:
1: In x - cMin[] :
  longer object length is not a multiple of shorter object length
2: In (x - cMin[])/(cMax - cMin) :
  longer object length is not a multiple of shorter object length
3: In x - cMin[] :
  longer object length is not a multiple of shorter object length
4: In (x - cMin[])/(cMax - cMin) :
  longer object length is not a multiple of shorter object length

Can someone explain how to use apply with a function that involves a vector (cMin or cMax)?

R stores a matrix column by column, so when you subtract a vector from a matrix the recycling rule runs the vector down each column: its elements line up with the rows, not with the columns. To pair cMin and cMax with the columns instead, transpose the matrix, do the calculation, and transpose the result back:

t((t(mat) - cMin)/(cMax - cMin))

#    c1   c2    c3        c4    c5    c6
#r1 0.0 0.50 0.750 0.4444444 0.375 0.250
#r2 0.5 0.75 0.875 0.0000000 1.000 1.000
#r3 1.0 1.00 1.000 1.0000000 0.000 0.625
#r4 0.5 0.00 0.750 0.1111111 0.875 0.000
#r5 0.0 0.25 0.000 0.5555556 0.375 0.625
#r6 1.0 0.50 0.625 0.0000000 0.750 0.875
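
To see the recycling behaviour in isolation, here is a minimal sketch on a made-up 2 x 3 matrix (the names m and v are illustrative and not from the question). A plain subtraction runs v down the columns, while the double transpose subtracts v[j] from column j:

```r
m <- matrix(1:6, nrow = 2)   # columns are (1,2), (3,4), (5,6)
v <- c(10, 20, 30)           # one value per column of m

# Plain subtraction recycles v in column-major order: 10,20,30,10,20,30
recycled <- m - v
#      [,1] [,2] [,3]
# [1,]   -9  -27  -15
# [2,]  -18   -6  -24

# Transposing first lines v up with the original columns
per_col <- t(t(m) - v)
#      [,1] [,2] [,3]
# [1,]   -9  -17  -25
# [2,]   -8  -16  -24
```

Inside apply(mat, 2, f) each column arrives as a plain vector, so subtracting the whole of cMin pairs its elements with rows in the same way as the first result.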

library(magrittr)  # provides the %>% pipe
x <- data.frame(c1=c(1,2,3,2,1,3),
                c2=c(4,5,6,2,3,4),
                c3=c(7,8,9,7,1,6),
                c4=c(4,0,9,1,5,0),
                c5=c(3,8,0,7,3,6),
                c6=c(2,8,5,0,5,7),
                row.names = c("r1","r2","r3","r4","r5","r6"))

cMin <- apply(x, MARGIN = 2, FUN = min)
cMax <- apply(x, MARGIN = 2, FUN = max)

sweep(x, MARGIN = 2, STATS = cMin, FUN = "-") %>%
  sweep(., MARGIN = 2, STATS = (cMax - cMin), FUN = "/")

    c1   c2    c3        c4    c5    c6
r1 0.0 0.50 0.750 0.4444444 0.375 0.250
r2 0.5 0.75 0.875 0.0000000 1.000 1.000
r3 1.0 1.00 1.000 1.0000000 0.000 0.625
r4 0.5 0.00 0.750 0.1111111 0.875 0.000
r5 0.0 0.25 0.000 0.5555556 0.375 0.625
r6 1.0 0.50 0.625 0.0000000 0.750 0.875
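
If loading magrittr is not wanted, the same two sweep() steps can probably be written in base R alone. The sketch below uses a small made-up matrix; the names m, lo, hi, centered and scaled are illustrative only:

```r
m  <- cbind(a = c(1, 2, 3), b = c(10, 30, 20))
lo <- apply(m, 2, min)   # column minimums
hi <- apply(m, 2, max)   # column maximums

# MARGIN = 2 means STATS[j] is applied to column j
centered <- sweep(m, MARGIN = 2, STATS = lo, FUN = "-")
scaled   <- sweep(centered, MARGIN = 2, STATS = hi - lo, FUN = "/")
#        a   b
# [1,] 0.0 0.0
# [2,] 0.5 1.0
# [3,] 1.0 0.5
```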

As I see from the solutions, the aim is to scale each column linearly to the range 0 to 1, with the smallest value mapping to 0 and the largest to 1.

This can be done in one line, without computing cMin and cMax separately:

apply(x, 2, 
      function(each_col) (each_col - min(each_col))/diff(range(each_col)))

#     c1   c2    c3        c4    c5    c6
# r1 0.0 0.50 0.750 0.4444444 0.375 0.250
# r2 0.5 0.75 0.875 0.0000000 1.000 1.000
# r3 1.0 1.00 1.000 1.0000000 0.000 0.625
# r4 0.5 0.00 0.750 0.1111111 0.875 0.000
# r5 0.0 0.25 0.000 0.5555556 0.375 0.625
# r6 1.0 0.50 0.625 0.0000000 0.750 0.875
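
One caveat with this per-column formula: a column whose values are all equal has a range of zero, so the division gives NaN. A possible guard is sketched below. The helper name rescale01 and the choice to map a constant column to zeros are assumptions for illustration, not part of the answer above:

```r
# Hypothetical helper: rescale one numeric vector to [0, 1]
rescale01 <- function(v) {
  rng  <- range(v)
  span <- rng[2] - rng[1]
  # Assumption: a constant column becomes all zeros instead of NaN
  if (span == 0) return(rep(0, length(v)))
  (v - rng[1]) / span
}

demo   <- cbind(a = c(1, 2, 3), b = c(5, 5, 5))  # column b is constant
scaled <- apply(demo, 2, rescale01)
#        a b
# [1,] 0.0 0
# [2,] 0.5 0
# [3,] 1.0 0
```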

We can also replicate 'cMin' and 'cMax' to match the matrix, using col(mat) as the index, and do the calculation element by element:

(mat - cMin[col(mat)])/(cMax[col(mat)] - cMin[col(mat)])
#    c1   c2    c3        c4    c5    c6
#r1 0.0 0.50 0.750 0.4444444 0.375 0.250
#r2 0.5 0.75 0.875 0.0000000 1.000 1.000
#r3 1.0 1.00 1.000 1.0000000 0.000 0.625
#r4 0.5 0.00 0.750 0.1111111 0.875 0.000
#r5 0.0 0.25 0.000 0.5555556 0.375 0.625
#r6 1.0 0.50 0.625 0.0000000 0.750 0.875
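
To make the indexing trick easier to follow, here is a small sketch of what col() returns, using a made-up 2 x 3 matrix (m, v, idx, expanded and res are illustrative names). col(m) holds each cell's column number, so v[col(m)] repeats v[j] once for every cell of column j:

```r
m <- rbind(c(1, 10, 100),
           c(2, 20, 200))
v <- c(1, 10, 100)        # one value per column (here the column minimums)

idx <- col(m)             # column index of every cell
#      [,1] [,2] [,3]
# [1,]    1    2    3
# [2,]    1    2    3

expanded <- v[idx]        # 1 1 10 10 100 100, in column-major order

res <- m - expanded       # same length as m, so no recycling surprises
#      [,1] [,2] [,3]
# [1,]    0    0    0
# [2,]    1   10  100
```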


 