
Collapse branches in a phylogenetic tree based on partially matching taxa labels in R

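I built a phylogenetic tree for a bacterial DNA region in which sequences from the same bacterial species usually cluster in tight clades. Now I would like to collapse the branches that share a common label. I tried to define the labels to collapse using the following keywords, which partially match the terminal taxa names:

Keywords: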

("vulneris","ulcerans","blattae","coli","hermannii","albertii","periodonticum","fergusonii")

In R, I loaded the following file.newick:

(((((((((E_vulneris_otu44:0.03924,((E_vulneris_otu97:0.00766,
E_vulneris_otu96:0)0.8:0.00914,E_fergusonii_otu74:0.00725)0:0.0072)0:0,
((E_vulneris_otu95:0,
(((gi_undefined_HMPREF0402_04011_HMPREF0402_04011_E_ulcerans:0,
fig_768594rna24_RO08_01535_E_vulneris:0)0:0.00373,
(gi_undefined_HMPREF1766_00665_HMPREF1766_00665_E_vulneris:0,
fig_768595rna53_CBG60_05850_E_vulneris:0)0:0.00373)0.8:0.00701,
fig_7685910rna43_CI114_11510_E_vulneris:0)0.84:0.00717)0:0,
E_fergusonii_otu78:0.0072)0.85:0.00718)0:0,E_vulneris_otu94:0)0.82:0.00753,
E_vulneris_otu77:0)0.82:0.00698,(E_vulneris_otu93:0,((E_vulneris_otu89:0,
E_vulneris_otu90:0.00754)0:0.00765,E_vulneris_otu91:0)0.83:0.01608)0:0)
0.8:0.02319,(((E_vulneris_otu35:0,E_vulneris_otu34:0.00752)0.83:0.00766,
E_vulneris_otu28:0.00688)0:0,(E_vulneris_otu2:0.01715,E_vulneris_otu1:0)
0.89:0.01482)0.8:0.01541)0.89:0.02013,E_periodonticum_otu73:0)0.75:0.01535,
fig_86016rna55_CTM98_06410_E_periodonticum:0.00831)0.97:0.1808,
((((((E_blattae_otu76:0,E_blattae_otu75:0.01744)0.82:0.00698,
(E_blattae_otu4:0.00771,E_blattae_otu39:0)0.8:0.00762)0:0,
((gi_undefined_HMPREF1540_00319_HMPREF1540_00319_E_vulneris:0,
fig_8616rna58_DXA30_07775_E_ulcerans:0)0.81:0.00724,
gi_undefined_C4N16_02505_E_albertii:0)0.92:0.01676)0.78:0.01261,
E_blattae_otu92:0.004)0.78:0.02469,(((E_coli_otu8:0.01561,
E_coli_otu38:0.00378)0:0.00378,E_coli_otu33:0)0:0,
(((E_coli_otu54:0.00713,gi_undefined_C4N19_02700_E_coli:0)
0.73:0.00675,(((E_coli_otu57:0,E_coli_otu43:0.00715)0.84:0.00715,
E_coli_otu53:0)0.79:0.00852,((((E_coli_otu40:0,
E_coli_otu56:0.0076)0:0.00376,E_coli_otu55:0.00703)0:0.00376,
E_coli_otu37:0)0:0.0028,(E_coli_otu41:0,E_coli_otu4:0.00715)
0.9:0.00714)0:0.00395)0.79:0.00862)0.77:0.00764,E_coli_otu36:0)
0.82:0.00761)0.89:0.04396)0.83:0.0832,(gi_undefined_C4N18_07110_E_blattae:0,
gi_undefined_FUSO3_01390_E_hermannii:0.04598)0.92:0.1457)0.97:0.1015);

library(ape)
tree.test <- read.tree(file = "file.newick")

and plotted the tree using the ape and phytools packages:

## ggtree() and geom_tiplab() come from the ggtree package
library(ggtree)
ggtree(tree.test) + geom_tiplab()

But I don't know how to collapse the tree at the keyword level. Any suggestions would be appreciated. Thanks!

One way to do this is to remove all the OTUs but one in each species group, using the ape::drop.tip function:

library(ape)

## List of clades
clades <- c("vulneris","ulcerans","blattae","coli","hermannii","albertii","periodonticum","fergusonii")

## New tree placeholder
trimmed_tree <- tree.test

## Loop through each species keyword
for(one_clade in clades) {
    ## Find the tips matching the species name
    species <- grep(one_clade, trimmed_tree$tip.label)
    ## Removing all the species but the first one
    trimmed_tree <- drop.tip(trimmed_tree, trimmed_tree$tip.label[species[-1]])
}

## Displaying the trimmed tree (with one OTU per species)
plot(trimmed_tree)
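
The loop above keeps whichever tip happens to match first, still carrying its original OTU label, and it does not check whether the matching tips actually form a single clade. As a rough sketch of one possible extension (not part of the original answer), the version below warns when a keyword group is not monophyletic and relabels the retained tip with the keyword itself. The toy tree and its labels are hypothetical stand-ins for tree.test; is.monophyletic and drop.tip are both from ape.

```r
library(ape)

## Toy tree standing in for tree.test (hypothetical labels)
toy_tree <- read.tree(text = paste0(
    "(((E_coli_otu1:1,E_coli_otu2:1):1,E_blattae_otu3:2):1,",
    "(E_vulneris_otu4:1,E_vulneris_otu5:1):2);"))
keywords <- c("coli", "blattae", "vulneris")

collapsed <- toy_tree
for (kw in keywords) {
    ## fixed = TRUE treats the keyword as a literal string, not a regex
    hits <- grep(kw, collapsed$tip.label, fixed = TRUE, value = TRUE)
    if (length(hits) == 0) next
    if (length(hits) > 1) {
        ## Warn when the matching tips do not form a single clade
        if (!is.monophyletic(collapsed, hits)) {
            warning("Tips matching '", kw, "' are not monophyletic")
        }
        ## Keep the first matching tip and drop the rest
        collapsed <- drop.tip(collapsed, hits[-1])
    }
    ## Relabel the retained tip with the keyword itself
    collapsed$tip.label[collapsed$tip.label == hits[1]] <- kw
}

print(collapsed$tip.label)
```

Passing tree.test and the full keyword vector in place of the toy inputs should work the same way, and plot(collapsed) then draws the result. In the tree from the question, the vulneris tips appear to be interleaved with fergusonii and ulcerans tips, so the monophyly warning would likely fire for some groups. In that case dropping tips merges lineages that are not a single clade, which may or may not be acceptable for the figure.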
