
Searching pairs in matrix in R

I am rather new to R, so I would be grateful if anyone could help me :)

I have a large matrix (for example: matrix) and a vector of genes. My task is to search the matrix row by row, pair each gene that has a mutation (D707H in the matrix) with the rest of the genes in the vector, and add those pairs to a new matrix. I tried to do this with loops, but I have no idea how to write it correctly. For this matrix, the result should look something like this:

    PR.02.1431    
    NBN BRCA1
    NBN BRCA2
    NBN CHEK2
    NBN ELAC2
    NBN MSR1
    NBN PARP1
    NBN RNASEL

So far I have something like this: my idea

"a" is my initial matrix.

Can anyone point me in the right direction? :)

Perhaps what you want/need is which(..., arr.ind = TRUE).

Some sample data, for demonstration:

set.seed(2)
n <- 10
mtx <- array(NA, dim = c(n, n))
dimnames(mtx) <- list(letters[1:n], LETTERS[1:n])
mtx[sample(n*n, size = 4)] <- paste0("x", 1:4)
mtx
#   A  B    C  D  E  F    G    H  I  J 
# a NA NA   NA NA NA NA   NA   NA NA NA
# b NA NA   NA NA NA NA   NA   NA NA NA
# c NA NA   NA NA NA NA   NA   NA NA NA
# d NA NA   NA NA NA NA   NA   NA NA NA
# e NA NA   NA NA NA NA   NA   NA NA NA
# f NA NA   NA NA NA NA   NA   NA NA NA
# g NA "x4" NA NA NA "x3" NA   NA NA NA
# h NA NA   NA NA NA NA   NA   NA NA NA
# i NA "x1" NA NA NA NA   NA   NA NA NA
# j NA NA   NA NA NA NA   "x2" NA NA NA

In your case, it appears that you want anything that is not an NA or NaN. You might try:

which(! is.na(mtx) & ! is.nan(mtx))
# [1] 17 19 57 70

but that isn't always intuitive when retrieving the row/column pairs (genes, I think?). Try instead:

ind <- which(! is.na(mtx) & ! is.nan(mtx), arr.ind = TRUE)
ind
#   row col
# g   7   2
# i   9   2
# g   7   6
# j  10   7
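
The two outputs describe the same cells. Without arr.ind, which() returns linear indices that count down the columns (column-major order), and arr.ind = TRUE converts them to row/column positions. A minimal sketch of that relationship, using a smaller stand-in matrix with a single filled cell (the name lin and the filled cell are only for illustration):

```r
# Stand-in 10 x 10 character matrix with one non-NA cell at row "g", column "B"
mtx <- matrix(NA_character_, nrow = 10, ncol = 10,
              dimnames = list(letters[1:10], LETTERS[1:10]))
mtx["g", "B"] <- "x4"

# Linear index: 7 + (2 - 1) * 10, because R counts down each column in turn
lin <- which(!is.na(mtx))
lin
# [1] 17

# arrayInd() converts linear indices into (row, col) positions
pos <- arrayInd(lin, dim(mtx))
pos
#      [,1] [,2]
# [1,]    7    2
```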

How to use this: the integers are row and column indices, respectively. Assuming your matrix has row names and column names, you can retrieve the row names with:

rownames(mtx)[ ind[,"row"] ]
# [1] "g" "i" "g" "j"

(An astute reader might suggest I use rownames(ind) instead. It certainly works!) The same approach works for the column names, using colnames(mtx) and ind[,"col"].

Interestingly enough, even though ind is a matrix itself, you can subset mtx fairly easily with:

mtx[ind]
# [1] "x4" "x1" "x3" "x2"
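
This works because R treats a two-column index matrix as a list of (row, col) positions, one element per row of the index. A small sketch on a made-up numeric matrix (m, idx and by_name are hypothetical names, not from the question); a two-column character matrix is matched against the dimnames in the same way:

```r
# Made-up 2 x 3 matrix, filled column by column: A = (1, 2), B = (3, 4), C = (5, 6)
m <- matrix(1:6, nrow = 2,
            dimnames = list(c("r1", "r2"), c("A", "B", "C")))

# Each row of idx names one cell: (1, 3) and (2, 1)
idx <- cbind(row = c(1, 2), col = c(3, 1))
m[idx]
# [1] 5 2

# The same two cells, addressed by row and column names
by_name <- m[cbind(c("r1", "r2"), c("C", "A"))]
by_name
# [1] 5 2
```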

Combining all three together, you might be able to use:

data.frame(
  gene1 = rownames(mtx)[ ind[,"row"] ],
  gene2 = colnames(mtx)[ ind[,"col"] ],
  val = mtx[ind]
)
#   gene1 gene2 val
# 1     g     B  x4
# 2     i     B  x1
# 3     g     F  x3
# 4     j     G  x2

I found my mistake, and I now have the matrix. I went through your code and it works well, but that's not exactly what I want to do. a, b, c, d etc. are organisms and row names are genes (A, B, C, D etc.). I have to combine pairs of genes where one of them (in the same column) has something other than an NA value. For example, if gene A has value=4 in column a, I have to have:

   gene1 gene2
a    A     B
a    A     C
a    A     D
a    A     E   

I tried it this way, but the number of elements does not match and I do not know how to solve this.

ind= which(! is.na(a) & ! is.nan(a), arr.ind = TRUE)
ind1=which(macierz==1,arr.ind = TRUE)
ramka= data.frame(
  kolumna = rownames(a)[ ind[,"row"] ],
  gene1 = colnames(a)[ ind[,"col"] ],
  gene2 = colnames(a)[ind1[,"col"]],
  #val = macierz[ind]
)

Do you know how to do this in R?
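
One possible direction, as a sketch rather than a definitive answer: data.frame() needs columns of equal (or recyclable) length, so two index sets with different numbers of rows will not combine. It may be easier to build one small data frame per non-NA cell and stack them. The sketch below assumes, as in the attempt above, that row names are organisms and column names are genes; the toy matrix a, its values, and the names genes and pairs are made up for illustration. If you have a separate vector of genes, it could take the place of colnames(a).

```r
# Toy matrix (an assumption): rows are organisms, columns are genes
a <- matrix(NA_character_, nrow = 3, ncol = 4,
            dimnames = list(c("a", "b", "c"), c("A", "B", "C", "D")))
a["a", "A"] <- "4"
a["c", "C"] <- "D707H"

genes <- colnames(a)

# (row, col) positions of every non-NA cell
ind <- which(!is.na(a), arr.ind = TRUE)

# For each hit, pair the mutated gene with every other gene;
# the length-1 columns are recycled to match gene2
pairs <- do.call(rbind, lapply(seq_len(nrow(ind)), function(i) {
  g1 <- genes[ind[i, "col"]]
  data.frame(
    organism = rownames(a)[ind[i, "row"]],
    gene1 = g1,
    gene2 = setdiff(genes, g1),
    stringsAsFactors = FALSE
  )
}))
pairs
#   organism gene1 gene2
# 1        a     A     B
# 2        a     A     C
# 3        a     A     D
# 4        c     C     A
# 5        c     C     B
# 6        c     C     D
```

On a large matrix with many hits, repeated rbind can be slow, so this is meant to show the idea more than to be the fastest way.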

