
R: How to visualize pairwise alignment

How can I visualize the complete alignment of two sequences?

library(Biostrings)
s1 <-DNAString("ACTTCACCAGCTCCCTGGCGGTAAGTTGATCAAAGGAAACGCAAAGTTTTCACTTCACCAGCTCCCTGGCGGTAAGTTGATCAAAGGAAACGCAAAGTTTTCAAGAAGACTTCACCAGCTCCCTGGCGGTAAGTTGATCAAAGGAAACGCAAAGTTTTCAAG")
s2 <-DNAString("GTTTCACTACTTCCTTTCGGGTAAGTAAATATATGTTTCACTACTTCCTTTCGGGTAAGTGTTTCACTACTTCCTTTCGGGTAAGTAAATATATAAATATATAAAAATATAATTTTCATCAAATATATAAATATATAAAAATATAATTTTCATCAAATATATAAAAATATAATTTTCATC")
pairwiseAlignment(s1,s2)

Output:

Global PairwiseAlignmentsSingleSubject (1 of 1)
pattern: [1] ACTTCACCAGCTCCCTGGCGGTAAGTTGATCAAAGGAAACGCAAAGT--TTTCAC---...CTTCACCAGCTCCCTGGCGGTAAGTTG-ATCAAAGG---AAACGCAAAGTTTTCAAG 
subject: [1] GTTTCACTACTTCCTTTCGGGTAAGTAAAT-ATATGTTTCACTACTTCCTTTCGGGTA...TATATAAATATATAAAAATATAATTTTCATCAAATATATAAAAATATAATTTTCATC 
score: -394.7115 

Only part of the alignment is shown here. Do you know of any existing functions that either plot or print the full alignment?

You can find details on how to extract the aligned pattern and subject sequences under ?pairwiseAlignments.

Here is an example based on the sample data you provide:

  1. Store the pairwise alignment in a PairwiseAlignmentsSingleSubject object:

     alg <- pairwiseAlignment(s1,s2) 
  2. Extract the aligned pattern and subject sequences and merge them into a DNAStringSet object:

     seq <- c(alignedPattern(alg), alignedSubject(alg)) 
  3. You can access the full aligned sequences with as.character:

     as.character(seq)
     [1] "ACTTCACCAGCTCCCTGGCGGTAAGTTGATCAAAGGAAACGCAAAGT--TTTCAC--------TTCACCAGCTCCCTGGCGGTAAGTTGATC---AAAGG---AAACGCAAAGTTTTCAAGAAGACTTCACCAGCTCCCTGGCGGTAAGTTG-ATCAAAGG---AAACGCAAAGTTTTCAAG"
     [2] "GTTTCACTACTTCCTTTCGGGTAAGTAAAT-ATATGTTTCACTACTTCCTTTCGGGTAAGTGTTTCACTACTTCCTTTCGGGTAAGTAAATATATAAATATATAAAAATATAATTTTCATCAA-ATATATAAATATATAAAAATATAATTTTCATCAAATATATAAAAATATAATTTTCATC"

    It seems that alignedPattern and alignedSubject were added to Biostrings very recently, so an older installation may not have them. Alternatively, you can do

     seq <- c(aligned(pattern(alg)), aligned(subject(alg))) 

    but note that this will trim globally aligned sequences (see the documentation for details).

  4. The R/Bioconductor package DECIPHER offers a method to visualise XStringSet data in a web browser. It automatically adds colour-coding and a consensus sequence at the bottom. In your case, you would do

     library(DECIPHER)
     BrowseSeqs(seq)

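If a browser view is not convenient, the two gapped strings from step 3 can also be printed to the console in wrapped blocks. The following is only a minimal base R sketch: `format_alignment` is a hypothetical helper written for illustration (it is not a function from Biostrings or DECIPHER), and the short toy strings stand in for the result of as.character(seq).

```r
# Hypothetical helper, not part of Biostrings or DECIPHER.
# Formats two gapped, equal-length alignment strings as fixed-width blocks
# with a match line between them.
format_alignment <- function(pattern, subject, width = 60) {
  stopifnot(nchar(pattern) == nchar(subject), nchar(pattern) > 0, width > 0)
  p <- strsplit(pattern, "")[[1]]
  s <- strsplit(subject, "")[[1]]
  # "|" marks identical non-gap columns; mismatches and gaps are left blank
  m <- ifelse(p == s & p != "-", "|", " ")
  # First alignment column of every block
  starts <- seq(1, length(p), by = width)
  blocks <- vapply(starts, function(i) {
    j <- min(i + width - 1, length(p))
    paste(
      sprintf("pattern %5d %s", i, paste(p[i:j], collapse = "")),
      sprintf("%13s %s", "", paste(m[i:j], collapse = "")),
      sprintf("subject %5d %s", i, paste(s[i:j], collapse = "")),
      sep = "\n"
    )
  }, character(1))
  paste(blocks, collapse = "\n\n")
}

# Toy input (assumption): replace with aln <- as.character(seq) for real data
aln <- c("ACTT-CACC", "AC-TGCACG")
out <- format_alignment(aln[1], aln[2], width = 5)
cat(out, "\n")
```

The number printed at the start of each block is the alignment column, not the position in the ungapped sequence. A wider `width` gives fewer, longer blocks.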
