
Trying to root a phylogenetic tree using phangorn package in R

I am trying to root a phylogenetic tree (that I previously created using DECIPHER and phangorn with 16S data for a microbiome study) using the phangorn package in R. However, I am running into an error stating that the node number should be greater than the number of taxa. The tree has 2153 tips and 2151 internal nodes. The associated sequences were grouped into 2153 taxa using a DADA2 pipeline. The sequences in the tree and in refseq within the phyloseq object (ps) are exactly the same. I'm new to this. Please let me know if you need any additional information. Thank you in advance!

All relevant code up to the part in which an error occurred:

#Load the packages used below
library(dada2)
library(DECIPHER)
library(phangorn)
library(phyloseq)

#Extract sequences from the dada2 output object:

sequences <- getSequences(seqtab.nochim)
names(sequences) <- sequences

#Run sequence alignment (MSA) using DECIPHER:

alignment <- AlignSeqs(DNAStringSet(sequences), anchor=NA)

#Change sequence alignment output into a phyDat structure

phang.align <- phyDat(as(alignment, "matrix"), type="DNA")

#Create distance matrix

dm <- dist.ml(phang.align)

#Perform Neighbor joining

treeNJ <- NJ(dm)

#Note, tip order is not sequence order

#Internal maximum likelihood

fit <- pml(treeNJ, data=phang.align)

#negative edges length changed to 0!

#Fit the tree #Note: this step may take quite a while...

fitGTR <- update(fit, k=4, inv=0.2)
fitGTR <- optim.pml(fitGTR, model="GTR", optInv=TRUE, optGamma=TRUE,
                    rearrangement = "stochastic", control = pml.control(trace = 0))

#Import into existing phyloseq object "ps":

ps@phy_tree <- fitGTR$tree
ps@phy_tree

#Phylogenetic tree with 2153 tips and 2151 internal nodes - good!

#Need to root it for phylogeny-based diversity metrics (Unifrac)

phy_tree(ps) <- root(phy_tree(ps), taxa_names(ps), 1, resolve.root = TRUE)
Error in root.phylo(phy_tree(ps), taxa_names(ps), 1, resolve.root = TRUE) : incorrect node#: should be greater than the number of taxa

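ape::root expects the id of the new root node in its node argument, and in your call the third positional argument (1) is matched to node. Nodes are numbered with the leaves (tips) first, from 1 up to the number of taxa, followed by the internal nodes, so node 1 is a leaf. A root must have children and a leaf has none, so a leaf cannot be the new root node, hence the error.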
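
If the goal is to root the tree relative to a particular tip rather than at an internal node, root also has an outgroup argument that takes tip labels. A minimal sketch on a toy tree (the tree and the choice of E as outgroup are illustrative assumptions, not taken from your data):

```r
library(ape)

# Toy tree; unroot() gives it a trifurcating base, as in an unrooted NJ/ML tree
tree <- unroot(read.tree(text = "((A,B),((C,D),(E,F)));"))

# Pass the tip by name as `outgroup` and leave `node` at its default (NULL)
rooted <- root(tree, outgroup = "E", resolve.root = TRUE)

# Check whether the result is rooted
is.rooted(rooted)
```

On a phyloseq object the same pattern would presumably be phy_tree(ps) <- root(phy_tree(ps), outgroup = some_taxon, resolve.root = TRUE), where some_taxon is a placeholder for one tip label you have chosen as the outgroup. Which taxon is a sensible outgroup depends on your data.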

Use the function as_tibble from the tidytree package to get a table of node metadata to extract the desired node id:

library(ape)
library(tidytree)

# create example tree
tree <- read.tree(text='((A, B), ((C, D), (E, F)));')
tree <- makeNodeLabel(tree)

as_tibble(tree)
#> # A tibble: 11 x 3
#>    parent  node label
#>     <int> <int> <chr>
#>  1      8     1 A    
#>  2      8     2 B    
#>  3     10     3 C    
#>  4     10     4 D    
#>  5     11     5 E    
#>  6     11     6 F    
#>  7      7     7 Node1
#>  8      7     8 Node2
#>  9      7     9 Node3
#> 10      9    10 Node4
#> 11      9    11 Node5
plot(tree, show.node.label = TRUE)

# Try to root the tree on leaf E (node 5)
# This fails: a leaf cannot be the new root node
tree2 <- root(phy = tree, node = 5)
#> Error in root.phylo(phy = tree, node = 5): incorrect node#: should be greater than the number of taxa

# Root tree on internal Node2 (8th node)
tree3 <- root(phy = tree, node = 8)
plot(tree3, show.node.label = TRUE)
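
As an alternative to reading the node id off the table, ape::getMRCA returns the internal node that is the most recent common ancestor of a set of tips. A short sketch on the same example tree (the choice of tips E and F, and the name tree4, are arbitrary and only for illustration):

```r
library(ape)

tree <- read.tree(text = "((A, B), ((C, D), (E, F)));")

# Internal node that is the most recent common ancestor of tips E and F
# (node 11 in the table above)
node_id <- getMRCA(tree, c("E", "F"))

# Internal node ids are always greater than the number of tips
node_id > Ntip(tree)

# Use that id as the new root node
tree4 <- root(phy = tree, node = node_id)
```

This avoids hard-coding a node number, which can change if the tree is rebuilt.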

Created on 2022-02-10 by the reprex package (v2.0.0)
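
When there is no obvious outgroup or internal node to root on, midpoint rooting is another option. phangorn provides a midpoint function for this. The sketch below uses a random tree as a stand-in for real data and is not part of the reprex above:

```r
library(ape)
library(phangorn)

# Random unrooted tree with branch lengths, standing in for a real tree
set.seed(1)
tree <- rtree(8, rooted = FALSE)

# Place the root halfway along the longest tip-to-tip path
rooted <- midpoint(tree)

# Check whether the result is rooted
is.rooted(rooted)
```

Applied to the question, this would presumably look like phy_tree(ps) <- phangorn::midpoint(phy_tree(ps)). That line is untested against your phyloseq object, and whether midpoint rooting is appropriate for your UniFrac analysis is a judgment call.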
