
How to take Log2 of a matrix having negative values for Boxplot in R

I have a matrix with three columns whose values vary widely, from large positive values through 0 to large negative values. To represent the data better I want to take log2 of every value, but since log2 is undefined for negative values and 0, I want to do the following:

  1. If number = 0 then change it to 1 and take log2
  2. If number < 0 then take log2 of the absolute value and give the result a negative sign
  3. If number > 0 then take log2 of the number

I am trying to do this with the following code, but without success so far:

Log2Transformed <- ifelse(df == 0, 1, log2(df) & ifelse(df < 0, -log2(abs(df)), log2(df)))

head(df)
     Open_TD Close_TD Invariant_TD
[1,]       1        6            5
[2,]       2        2            4
[3,]       0        0           -1
[4,]       0        0            2
[5,]       NA       0            2
[6,]       NA       0            1

Another way of doing this is to use the sign function; you would still need to replace the 0s in a separate step, e.g.:

test <- rnorm(100)
abs_log <- function(x){
  x[x==0] <- 1
  si <- sign(x)
  si * log2(si*x)
}

boxplot(abs_log(test))
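Because every operation in abs_log is element-wise, the same function should also work directly on a matrix shaped like the one in the question. The sketch below rebuilds that matrix by hand from the head(df) output (the name m is only for illustration) and assumes the NA entries should simply stay NA:

```r
# abs_log as defined above, repeated so the example is self-contained
abs_log <- function(x) {
  x[x == 0] <- 1          # zeros become 1, so log2 gives 0
  si <- sign(x)           # -1, 1, or NA for each element
  si * log2(si * x)       # si * x equals abs(x)
}

# Matrix rebuilt from the question's head(df) output (illustrative copy)
m <- cbind(
  Open_TD      = c(1, 2, 0, 0, NA, NA),
  Close_TD     = c(6, 2, 0, 0, 0, 0),
  Invariant_TD = c(5, 4, -1, 2, 2, 1)
)

res <- abs_log(m)   # same dimensions and column names as m
boxplot(res)        # one box per column
```

Passing the matrix to boxplot draws one box per column, and NA values are left out of the plot.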

There are probably clever ways of doing this, but I would take my time and define each step more clearly.

## Create dummy data
dd = data.frame(x = c(0, rnorm(100)))

First create a column for the transformed data

dd$trans = dd$x

Then gradually manipulate the column following your rules

#If number = 0 then change it to 1 and take log2
dd$trans[dd$x==0] = log2(1)
#If number < 0 then take log2 of absolute value 
# and assign the negative number to it
dd$trans[dd$x< 0] = -log2(abs(dd$x[dd$x <0]))
#If number > 0 then take log2 of the number
dd$trans[dd$x> 0] = log2(dd$x[dd$x >0])

Then plot the result:

boxplot(dd$trans)

I would create a function called trans_log2 that does this automatically, e.g.:

dd$x = trans_log2(dd$x)
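The answer names trans_log2 but does not define it. One possible definition, simply wrapping the three steps above in a function, might look like this; the explicit NA handling is an assumption added here so that missing values pass through unchanged:

```r
# A possible trans_log2: signed log2 following the three rules above.
# Treating NA as "leave untouched" is an assumption, not part of the answer.
trans_log2 <- function(x) {
  out  <- x
  zero <- !is.na(x) & x == 0
  neg  <- !is.na(x) & x < 0
  pos  <- !is.na(x) & x > 0
  out[zero] <- log2(1)              # rule 1: 0 is treated as 1, giving 0
  out[neg]  <- -log2(abs(x[neg]))   # rule 2: log2 of abs value, negated
  out[pos]  <- log2(x[pos])         # rule 3: plain log2
  out
}

trans_log2(c(0, -8, 8, NA))
```

Since it only uses logical indexing, the same sketch should accept a numeric vector or a numeric matrix.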

Let's do this constructively:

If x > 0 we log it.

If x == 0 we replace it with 1, then log.

If x < 0 we take the absolute value, log it, then negate. That is, for a negative x = -y with y > 0, the output should be -1*log(y), which is exactly log(1/y).

So we'd like to replace each negative x with 1/abs(x) while leaving the positives alone. abs(x) does not affect the positives, and sign(x) identifies the negatives, so raising abs(x) to the power sign(x) replaces only the negatives with their reciprocals.

All in all, the value substitution is (abs(x))^(sign(x)), after which we can happily take log2. Zeros need no special case: sign(0) is 0 and R evaluates 0^0 as 1, so they become log2(1) = 0. We get:

Log2Transformed <- log2((abs(df))^(sign(df)))

For this input (based on your example):

  Open_TD Close_TD Invariant_TD
1     1.0        6            5
2     2.0        2            4
3   -32.0        0           -1
4    -0.5        0            2
5      NA        0            0
6      NA        0            1

We get the following output:

     Open_TD Close_TD Invariant_TD
[1,]       0 2.584963     2.321928
[2,]       1 1.000000     2.000000
[3,]      -5 0.000000     0.000000
[4,]       1 0.000000     1.000000
[5,]      NA 0.000000     0.000000
[6,]      NA 0.000000     0.000000
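To check the table above, the example input can be rebuilt as a matrix and run through the one-liner. This is only a verification sketch; the names df and expected are chosen here for illustration:

```r
# Example input from above, rebuilt as a numeric matrix
df <- cbind(
  Open_TD      = c(1, 2, -32, -0.5, NA, NA),
  Close_TD     = c(6, 2, 0, 0, 0, 0),
  Invariant_TD = c(5, 4, -1, 2, 0, 1)
)

Log2Transformed <- log2((abs(df))^(sign(df)))

# Values the three rules should give for each column
expected <- cbind(
  Open_TD      = c(0, 1, -5, 1, NA, NA),
  Close_TD     = c(log2(6), 1, 0, 0, 0, 0),
  Invariant_TD = c(log2(5), 2, 0, 1, 0, 0)
)

all.equal(Log2Transformed, expected)
```

Note that -0.5 maps to 1, because -log2(0.5) is 1: values with magnitude below 1 flip sign under this transform.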

It is a one-liner with no extra functions and no need to change the original data or create new data frames, and to top it all it uses vectorized matrix operations, which are idiomatic in R and MATLAB.

