
How to apply a custom function with proxy::dist to create a distance matrix in R

I have defined a custom function and tested it to make sure it works, but I cannot apply it to a list to get a distance matrix.

My code is:

library(Biostrings)
library(proxy)

#import the sequences using Biostrings
indf<-readAAStringSet("C:/Users/jamie/OneDrive/Documents/Junk/SAMPLEFASTA.fasta")

#Assign the names and sequences to different variables
seqAAname<-names(indf)
seqz<-paste(indf)

#Put just the sequences into a dataframe
indf2<-data.frame(seqz)

#Convert the sequences into a list
indf3<-as.list(indf2)

#Define a custom function to return the alignment score between two sequences (pairwise)
customalnfunc <- function(X, Y){
  pairwiseAlignment(X, Y,
                    substitutionMatrix = "BLOSUM45", gapOpening = 1, gapExtension = 3)
}

#Test the function but not as a function (This works fine)
testfreefunc<-  pairwiseAlignment(AAString("PEHQRSTVE"),AAString("PQHQRETVE"),
                    substitutionMatrix = "BLOSUM45", gapOpening = 1, gapExtension = 3)
print(testfreefunc@score)


#Test the function as a function to make sure it works (This works fine)
testfuncout <- customalnfunc(AAString("PEHQRSTVE"),AAString("PQHQRETVE"))
print(testfuncout@score)

#Apply the custom function to all possible pairs using proxy::dist with the custom function (This does not work, it returns 0)
outalnmatrix <- proxy::dist(indf3, method = customalnfunc)
outalnmatrix

The SAMPLEFASTA.fasta file contains:

>SeqA
PEHQRSTVE
>SeqB
PQHQRETVE
>SeqC
RQHERSEVE

The desired output from outalnmatrix is: [image not included]

I have tried passing the input data to proxy::dist both as a list and as a matrix.

How can I make this work?

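You don't need the proxy package here, because proxy::dist is meant for comparing the rows of matrices or data frames. Since you want to compare strings, you can use outer instead. However, you need to adjust your customalnfunc function so that it returns only a number (scoreOnly = TRUE).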

library(Biostrings)

seqz <- c("PEHQRSTVE", "PQHQRETVE", "RQHERSEVE")

customalnfunc <- function(X, Y){
  pairwiseAlignment(X, Y,
                    substitutionMatrix = "BLOSUM45",
                    gapOpening = 1,
                    gapExtension = 3,
                    scoreOnly = TRUE)
}

outer(seqz, seqz, customalnfunc)

#>      [,1] [,2] [,3]
#> [1,]   58   50   33
#> [2,]   50   60   33
#> [3,]   33   33   57
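
As a small follow-up, here is a base R sketch of two related points. First, as.list() on a data frame returns one element per column, so the list built in the question holds a single element; that is likely why there were no pairs for proxy::dist to compare. Second, the score matrix can be labelled with the sequence names. The scores below are copied by hand from the outer() result in this answer, and the names are taken from the sample FASTA file; the variable names scores and seq_names are only illustrative.

```r
seqz <- c("PEHQRSTVE", "PQHQRETVE", "RQHERSEVE")

# as.list() on a data frame gives one list element per column, not per row,
# so this list has a single element containing all three sequences
indf3 <- as.list(data.frame(seqz))
length(indf3)
#> [1] 1

# Scores copied from the outer() output above (not recomputed here)
scores <- matrix(c(58, 50, 33,
                   50, 60, 33,
                   33, 33, 57),
                 nrow = 3, byrow = TRUE)

# Label rows and columns with the sequence names from SAMPLEFASTA.fasta
seq_names <- c("SeqA", "SeqB", "SeqC")
dimnames(scores) <- list(seq_names, seq_names)
scores
```

When the sequences are read with readAAStringSet as in the question, names(indf) should provide the same labels, so they need not be typed by hand.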


