How to transform matrix into numeric matrix?
I have made a scoring matrix in a text file for local alignment with the pairwiseAlignment function. Then I used this call to read it into R:
ex <- as.matrix(read.table("~/scoringMatrix", header=FALSE, sep = "\t", row.names = 1, as.is=TRUE))
The format is this:
> ex
A T C G
A 5 -2 -1 -2
T -2 7 -1 -2
C -1 -1 7 2
G -2 -2 2 8
Now whenever I use the pairwiseAlignment function I get the following error:
pairwiseAlignment(x[[1]], x[[2]], substitutionMatrix = ex, gapOpening = -2, gapExtension = -8, scoreOnly = FALSE)
Error in XStringSet.pairwiseAlignment(pattern = pattern, subject = subject, :
'substitutionMatrix' must be a numeric matrix
If I use an existing substitution matrix such as BLOSUM50, it works perfectly. So how do I make this matrix suitable for pairwiseAlignment?
> dput(ex)
structure(logical(0), .Dim = c(5L, 0L), .Dimnames = list(c(" A T C G",
"A 5 -2 -1 -2", "T -2 7 -1 -2", "C -1 -1 7 2", "G -2 -2 2 8"
), NULL))
While dput(BLOSUM50) looks completely different:
> dput(BLOSUM50)
structure(c(5L, -2L, -1L, -2L, -1L, -1L, -1L, 0L, -2L, -1L, -2L,
-1L, -1L, -3L, -1L, 1L, 0L, -3L, -2L, 0L, -2L, -1L, -1L, -5L,
-2L, 7L, -1L, -2L, -4L, 1L, 0L, -3L, 0L, -4L, -3L, 3L, -2L, -3L,
-3L, -1L, -1L, -3L, -1L, -3L, -1L, 0L, -1L, -5L, -1L, -1L, 7L,
2L, -2L, 0L, 0L, 0L, 1L, -3L, -4L, 0L, -2L, -4L, -2L, 1L, 0L,
-4L, -2L, -3L, 4L, 0L, -1L, -5L, -2L, -2L, 2L, 8L, -4L, 0L, 2L,
-1L, -1L, -4L, -4L, -1L, -4L, -5L, -1L, 0L, -1L, -5L, -3L, -4L,
5L, 1L, -1L, -5L, -1L, -4L, -2L, -4L, 13L, -3L, -3L, -3L, -3L,
-2L, -2L, -3L, -2L, -2L, -4L, -1L, -1L, -5L, -3L, -1L, -3L, -3L,
-2L, -5L, -1L, 1L, 0L, 0L, -3L, 7L, 2L, -2L, 1L, -3L, -2L, 2L,
0L, -4L, -1L, 0L, -1L, -1L, -1L, -3L, 0L, 4L, -1L, -5L, -1L,
0L, 0L, 2L, -3L, 2L, 6L, -3L, 0L, -4L, -3L, 1L, -2L, -3L, -1L,
-1L, -1L, -3L, -2L, -3L, 1L, 5L, -1L, -5L, 0L, -3L, 0L, -1L,
-3L, -2L, -3L, 8L, -2L, -4L, -4L, -2L, -3L, -4L, -2L, 0L, -2L,
-3L, -3L, -4L, -1L, -2L, -2L, -5L, -2L, 0L, 1L, -1L, -3L, 1L,
0L, -2L, 10L, -4L, -3L, 0L, -1L, -1L, -2L, -1L, -2L, -3L, 2L,
-4L, 0L, 0L, -1L, -5L, -1L, -4L, -3L, -4L, -2L, -3L, -4L, -4L,
-4L, 5L, 2L, -3L, 2L, 0L, -3L, -3L, -1L, -3L, -1L, 4L, -4L, -3L,
-1L, -5L, -2L, -3L, -4L, -4L, -2L, -2L, -3L, -4L, -3L, 2L, 5L,
-3L, 3L, 1L, -4L, -3L, -1L, -2L, -1L, 1L, -4L, -3L, -1L, -5L,
-1L, 3L, 0L, -1L, -3L, 2L, 1L, -2L, 0L, -3L, -3L, 6L, -2L, -4L,
-1L, 0L, -1L, -3L, -2L, -3L, 0L, 1L, -1L, -5L, -1L, -2L, -2L,
-4L, -2L, 0L, -2L, -3L, -1L, 2L, 3L, -2L, 7L, 0L, -3L, -2L, -1L,
-1L, 0L, 1L, -3L, -1L, -1L, -5L, -3L, -3L, -4L, -5L, -2L, -4L,
-3L, -4L, -1L, 0L, 1L, -4L, 0L, 8L, -4L, -3L, -2L, 1L, 4L, -1L,
-4L, -4L, -2L, -5L, -1L, -3L, -2L, -1L, -4L, -1L, -1L, -2L, -2L,
-3L, -4L, -1L, -3L, -4L, 10L, -1L, -1L, -4L, -3L, -3L, -2L, -1L,
-2L, -5L, 1L, -1L, 1L, 0L, -1L, 0L, -1L, 0L, -1L, -3L, -3L, 0L,
-2L, -3L, -1L, 5L, 2L, -4L, -2L, -2L, 0L, 0L, -1L, -5L, 0L, -1L,
0L, -1L, -1L, -1L, -1L, -2L, -2L, -1L, -1L, -1L, -1L, -2L, -1L,
2L, 5L, -3L, -2L, 0L, 0L, -1L, 0L, -5L, -3L, -3L, -4L, -5L, -5L,
-1L, -3L, -3L, -3L, -3L, -2L, -3L, -1L, 1L, -4L, -4L, -3L, 15L,
2L, -3L, -5L, -2L, -3L, -5L, -2L, -1L, -2L, -3L, -3L, -1L, -2L,
-3L, 2L, -1L, -1L, -2L, 0L, 4L, -3L, -2L, -2L, 2L, 8L, -1L, -3L,
-2L, -1L, -5L, 0L, -3L, -3L, -4L, -1L, -3L, -3L, -4L, -4L, 4L,
1L, -3L, 1L, -1L, -3L, -2L, 0L, -3L, -1L, 5L, -4L, -3L, -1L,
-5L, -2L, -1L, 4L, 5L, -3L, 0L, 1L, -1L, 0L, -4L, -4L, 0L, -3L,
-4L, -2L, 0L, 0L, -5L, -3L, -4L, 5L, 2L, -1L, -5L, -1L, 0L, 0L,
1L, -3L, 4L, 5L, -2L, 0L, -3L, -3L, 1L, -1L, -4L, -1L, 0L, -1L,
-2L, -2L, -3L, 2L, 5L, -1L, -5L, -1L, -1L, -1L, -1L, -2L, -1L,
-1L, -2L, -1L, -1L, -1L, -1L, -1L, -2L, -2L, -1L, 0L, -3L, -1L,
-1L, -1L, -1L, -1L, -5L, -5L, -5L, -5L, -5L, -5L, -5L, -5L, -5L,
-5L, -5L, -5L, -5L, -5L, -5L, -5L, -5L, -5L, -5L, -5L, -5L, -5L,
-5L, -5L, 1L), .Dim = c(24L, 24L), .Dimnames = list(c("A", "R",
"N", "D", "C", "Q", "E", "G", "H", "I", "L", "K", "M", "F", "P",
"S", "T", "W", "Y", "V", "B", "Z", "X", "*"), c("A", "R", "N",
"D", "C", "Q", "E", "G", "H", "I", "L", "K", "M", "F", "P", "S",
"T", "W", "Y", "V", "B", "Z", "X", "*")))
It looks like your 'scoringMatrix' file has space-delimited columns, and that it can be read in with just
ex = as.matrix(read.delim("scoringMatrix", sep=""))
which has structure
> dput(ex)
structure(c(5L, -2L, -1L, -2L, -2L, 7L, -1L, -2L, -1L, -1L, 7L,
2L, -2L, -2L, 2L, 8L), .Dim = c(4L, 4L), .Dimnames = list(c("A",
"T", "C", "G"), c("A", "T", "C", "G")))
In your input, there were no tab characters (\t), so each line was read in as a single column. And row.names=1 means that this single column is assigned as the row names, so you've got 5 rows and zero columns:
> read.table("scoringMatrix", sep="\t", header=FALSE, row.names=1)
data frame with 0 columns and 5 rows
Coercing this to a matrix results in a 5 x 0 matrix, and what you see in your original display are the row names (!) of the matrix.
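The behaviour described above can be reproduced without the original file. The following is a minimal sketch that assumes the file looks like the display in the question (whitespace-delimited, with a header line that has one field fewer than the data lines); the temporary file and the names `bad` and `good` are only for illustration.

```r
# Recreate a whitespace-delimited scoring file like the one in the question.
f <- tempfile()
writeLines(c("  A  T  C  G",
             "A 5 -2 -1 -2",
             "T -2 7 -1 -2",
             "C -1 -1 7 2",
             "G -2 -2 2 8"), f)

# sep = "\t": the file contains no tabs, so each line is one field, and
# row.names = 1 turns that only column into the row names.
bad <- as.matrix(read.table(f, header = FALSE, sep = "\t",
                            row.names = 1, as.is = TRUE))
dim(bad)         # 5 0
is.numeric(bad)  # FALSE

# sep = "": split on any whitespace; the header has one field fewer than
# the data rows, so the first column is used as the row names.
good <- as.matrix(read.delim(f, sep = ""))
dim(good)         # 4 4
is.numeric(good)  # TRUE
```

Checking `dim()` and `is.numeric()` right after reading the file is a quick way to catch this kind of delimiter mismatch before the matrix reaches pairwiseAlignment.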
The matrix could also be created in R 'by hand', as suggested by @DavidArenburg, with
matrix(c( 5, -2, -1, -2,
         -2,  7, -1, -2,
         -1, -1,  7,  2,
         -2, -2,  2,  8),
       nrow=4, ncol=4,
       dimnames=list(
           # same letter order as the header of the scoring file
           c("A", "T", "C", "G"),
           c("A", "T", "C", "G")),
       byrow=TRUE)
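When typing a matrix in by hand it is easy to mislabel a row or break the symmetry, so a few sanity checks may help. The helper below is hypothetical (`check_scoring_matrix` is not part of Biostrings), and the properties it checks are ones I would assume a substitution matrix should have, not a statement of what pairwiseAlignment itself validates.

```r
# Hypothetical helper (not part of Biostrings): basic sanity checks on a
# hand-built substitution matrix. Stops with an error if a check fails.
check_scoring_matrix <- function(m) {
  stopifnot(
    is.matrix(m),
    is.numeric(m),                        # integer or double storage
    nrow(m) == ncol(m),                   # square
    identical(rownames(m), colnames(m)),  # same letters, same order
    isSymmetric(m)                        # score(a, b) equals score(b, a)
  )
  invisible(TRUE)
}

ex <- matrix(c( 5, -2, -1, -2,
               -2,  7, -1, -2,
               -1, -1,  7,  2,
               -2, -2,  2,  8),
             nrow = 4, byrow = TRUE,
             dimnames = list(c("A", "T", "C", "G"),
                             c("A", "T", "C", "G")))
check_scoring_matrix(ex)
```

The 5 x 0 matrix from the question would fail at the `is.numeric()` check, which matches the error message pairwiseAlignment reported.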
Another option is to select the desired rows and columns from BLOSUM50 using match, which avoids creating the file manually in a text editor in the first place:
indx <- match(c("A", "T", "C", "G"), rownames(BLOSUM50))
BLOSUM50[indx, indx]
# A T C G
# A 5 0 -1 0
# T 0 5 -1 -2
# C -1 -1 13 -3
# G 0 -2 -3 8
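Note that BLOSUM50 is an amino-acid matrix, so the letters A, T, C and G index amino-acid scores there, and the values differ from those in the question's file. The match-based subsetting itself works on any named matrix. Here is a small sketch on a toy matrix; `big`, `wanted` and the scores are made up for illustration, and the `anyNA()` guard is an addition of mine.

```r
# Toy symmetric matrix standing in for a larger built-in one (made-up values).
letters5 <- c("A", "R", "C", "G", "T")
big <- outer(seq_along(letters5), seq_along(letters5),
             function(i, j) ifelse(i == j, 5L, -1L))
dimnames(big) <- list(letters5, letters5)

# Pick rows and columns by name, in the order we want them.
wanted <- c("A", "T", "C", "G")
indx <- match(wanted, rownames(big))
stopifnot(!anyNA(indx))   # every requested letter must exist in the matrix
small <- big[indx, indx]
rownames(small)           # "A" "T" "C" "G"
```

Because match returns NA for a missing name, checking for NA before subsetting avoids silently getting NA rows.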