
Create a -1, 0, 1 matrix from two data frames

I would like to create a matrix with the values -1, 0, and 1 for expression data. The issue I'm encountering is that the data is in two separate data frames: one contains the over-expression values and the other the under-expression values. I would like to combine them into a single column, with over-expressed terms showing a "1", under-expressed terms a "-1", and no change a "0".

>over
0.09
0.08
0.02
0.10
0.07
>under
0.07
0.03
0.06
0.01
0.02

So I would like a matrix that gives a 1 where the value in over is < 0.05 and a -1 where the value in under is < 0.05:

>new
0
-1
1
-1
-1

I've tried a couple of different things but keep hitting walls, and I haven't been able to find a similar question that covers this specifically.

It's just a couple of basic assignments:

# recreate your data
over  <- c(0.09,0.08,0.02,0.10,0.07)
under <- c(0.07,0.03,0.06,0.01,0.02)

out <- vector("numeric",5)
out[over  < 0.05] <-  1
out[under < 0.05] <- -1
out
#[1]  0 -1  1 -1 -1

Or, as shorthand, use interaction to check multiple conditions at once. This has the added advantage of handling cases that meet both criteria and labelling them as such (here with a 2), and it allows arbitrary labelling.

c(0,1,-1,2)[interaction(over < 0.05, under < 0.05)]
#[1]  0 -1  1 -1 -1
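
One caveat worth keeping in mind: indexing by interaction() relies on both comparisons producing both FALSE and TRUE. If, say, no value in over is below 0.05, the factor ends up with fewer levels and the lookup vector no longer lines up. A minimal sketch that pins the levels explicitly follows; the helper name label_expression and the two small test inputs are made up for illustration.

```r
# Hypothetical helper: fix the factor levels so the lookup order is always
# FALSE.FALSE, TRUE.FALSE, FALSE.TRUE, TRUE.TRUE
label_expression <- function(over, under, cutoff = 0.05,
                             labels = c(0, 1, -1, 2)) {
  o <- factor(over  < cutoff, levels = c(FALSE, TRUE))
  u <- factor(under < cutoff, levels = c(FALSE, TRUE))
  labels[interaction(o, u)]
}

# Values from the question
over  <- c(0.09, 0.08, 0.02, 0.10, 0.07)
under <- c(0.07, 0.03, 0.06, 0.01, 0.02)
res_question <- label_expression(over, under)

# Made-up input: nothing in `over` is below the cutoff
res_no_over <- label_expression(c(0.09, 0.08), c(0.07, 0.03))

# Made-up input: the first element is below the cutoff in both vectors
res_both <- label_expression(c(0.01, 0.09), c(0.01, 0.09))
```

With the levels pinned, the second call still maps the under-only element to -1, and the third call labels the element that meets both criteria with 2.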

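You can also use the comparisons on each data frame directly and treat them as numeric. Each comparison yields only 0 or 1, so subtracting the under result from the over result gives -1, 0, or 1. Note that a row below 0.05 in both data frames comes out as 0.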

# df1 and df2 are defined under "Data" below
mat <- as.matrix(as.numeric(df1$over < 0.05) -
                 as.numeric(df2$under < 0.05))
mat
#      [,1]
#[1,]    0
#[2,]   -1
#[3,]    1
#[4,]   -1
#[5,]   -1

Data:

df1 <- data.frame(over=c(0.09, 0.08, 0.02, 0.10, 0.07))
df2 <- data.frame(under=c(0.07, 0.03, 0.06, 0.01, 0.02))

I'm sure there's a more elegant way than this, but you could bind the columns together, create a new column filled with 0s, test for the "over" and "under" conditions, and then convert the new column to a matrix, all using dplyr. Of course, if both conditions can be true, the second test will overwrite the result of the first.

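library(dplyr)
# assumes `over` and `under` are one-column data frames whose
# columns are also named `over` and `under`
new <- over %>%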
  bind_cols(under) %>%
  mutate(new = 0) %>%
  mutate(new = ifelse(over < 0.05, 1, new)) %>%
  mutate(new = ifelse(under < 0.05, -1, new)) %>%
  select(new) %>%
  as.matrix()

new
     new
[1,]   0
[2,]  -1
[3,]   1
[4,]  -1
[5,]  -1
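
If relying on the overwrite order feels too implicit, the same logic can be sketched as a single dplyr::case_when() call, where the first matching condition wins. This is an alternative sketch, not the answerer's code: it assumes one-column data frames named df1 and df2 (re-created here so the block runs on its own), and labelling rows that meet both conditions as 0 is an assumption you may want to change.

```r
library(dplyr)

# Example data, using the values from the question
df1 <- data.frame(over  = c(0.09, 0.08, 0.02, 0.10, 0.07))
df2 <- data.frame(under = c(0.07, 0.03, 0.06, 0.01, 0.02))

new <- bind_cols(df1, df2) %>%
  mutate(new = case_when(
    over < 0.05 & under < 0.05 ~  0,  # assumption: ambiguous rows count as "no change"
    over < 0.05                ~  1,  # over-expressed only
    under < 0.05               ~ -1,  # under-expressed only
    TRUE                       ~  0   # neither condition met
  )) %>%
  select(new) %>%
  as.matrix()
```

Because every outcome is spelled out, changing how the "both" case is handled only means editing the first branch.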

We can also do this without converting to numeric explicitly, since subtracting logical vectors coerces them to integers:

new <- (df1$over < 0.05) - (df2$under < 0.05)
dim(new) <- dim(df1)
new
#      [,1]
#[1,]    0
#[2,]   -1
#[3,]    1
#[4,]   -1
#[5,]   -1

Another option is:

matrix(Reduce(`-`, lapply(cbind(df1, df2), `<`, 0.05)))
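
The one-liner above packs several steps together. Unpacked, and assuming the df1/df2 data frames from the earlier answer, it might read as follows. The same elementwise subtraction also extends to data frames with several columns; the two-column over_df and under_df values below are invented purely for illustration.

```r
df1 <- data.frame(over  = c(0.09, 0.08, 0.02, 0.10, 0.07))
df2 <- data.frame(under = c(0.07, 0.03, 0.06, 0.01, 0.02))

# Step 1: compare every column with the cutoff -> list of logical vectors
flags <- lapply(cbind(df1, df2), `<`, 0.05)
# Step 2: subtract them (over - under); TRUE/FALSE behave as 1/0
diffs <- Reduce(`-`, flags)
# Step 3: wrap the result in a one-column matrix
step_by_step <- matrix(diffs)

# Hypothetical multi-column case: rows are genes, columns are samples
over_df  <- data.frame(s1 = c(0.01, 0.20), s2 = c(0.30, 0.04))
under_df <- data.frame(s1 = c(0.50, 0.02), s2 = c(0.03, 0.60))
multi <- (as.matrix(over_df) < 0.05) - (as.matrix(under_df) < 0.05)
```

Here step_by_step matches the result of the one-liner, and multi is a 2 x 2 matrix of -1, 0, and 1 values that keeps the column names of over_df.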
