R: How to extract all labels in a certain node of a dendrogram
I am writing a program that, as part of its work, automatically creates dendrograms from an input dataset. For each node/split I want to extract all the labels under that node, as well as the location of that node on the dendrogram plot (for further plotting purposes). So, let's say my data looks like this:
> Ltrs <- data.frame("A" = c(3,1), "B" = c(1,1), "C" = c(2,4), "D" = c(6,6))
> dend <- as.dendrogram(hclust(dist(t(Ltrs))))
> plot(dend)
Now I can extract the location of the splits/nodes:
> library(dendextend)
> nodes <- get_nodes_xy(dend)
> nodes <- nodes[nodes[,2] != 0, ]
> nodes
[,1] [,2]
[1,] 1.875 7.071068
[2,] 2.750 3.162278
[3,] 3.500 2.000000
Now I want to get all the labels under a node, for each node (i.e. for each row of the 'nodes' variable).
This should look something like this:
$`1`
[1] "D" "C" "B" "A"
$`2`
[1] "C" "B" "A"
$`3`
[1] "B" "A"
Can anybody help me out? Thanks in advance :)
How about something like this?
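library(tidyverse)   # only purrr::reduce() is used below
library(dendextend)
Ltrs <- data.frame("A" = c(3,1), "B" = c(1,1), "C" = c(2,4), "D" = c(6,6))
dend <- as.dendrogram(hclust(dist(t(Ltrs))))
accumulator <- list()
# Recursively collect leaf labels; every internal node appends its labels to `accumulator`.
myleaves <- function(anode) {
  # A leaf is not a list: return its label.
  if (!is.list(anode)) return(attr(anode, "label"))
  # Internal node: combine the labels of all children and store them globally.
  accumulator[[length(accumulator) + 1]] <<- reduce(lapply(anode, myleaves), c)
}
myleaves(dend)
ret <- rev(accumulator)  # generation was depth first, so the root was found last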
Better test this before relying on it. In particular, check that the list ret is in an order that matches your nodes, otherwise associating the entries with the correct nodes is going to be a pain. Good luck.
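On the ordering concern: reversing a depth-first (post-order) list gives root, right subtree, left subtree, which is not the same as a top-down, left-to-right (pre-order) walk once both children of a node are themselves splits. A minimal base-R sketch of a pre-order variant follows. The helper names leaf_labels and preorder_labels are made up for illustration, and the assumption that get_nodes_xy() lists nodes in pre-order is based only on the example in the question, so verify it on your own tree.

```r
# Example data from the question.
Ltrs <- data.frame("A" = c(3,1), "B" = c(1,1), "C" = c(2,4), "D" = c(6,6))
dend <- as.dendrogram(hclust(dist(t(Ltrs))))

# Hypothetical helper: all leaf labels below a node, left to right.
leaf_labels <- function(node) {
  if (is.leaf(node)) return(attr(node, "label"))
  unlist(lapply(seq_along(node), function(i) leaf_labels(node[[i]])),
         use.names = FALSE)
}

# Hypothetical helper: one entry per internal node, in pre-order
# (the node itself first, then its children from left to right).
preorder_labels <- function(node) {
  if (is.leaf(node)) return(list())
  children <- lapply(seq_along(node), function(i) preorder_labels(node[[i]]))
  c(list(leaf_labels(node)), do.call(c, children))
}

res <- preorder_labels(dend)
```

For this small tree res has three entries (the root, the C/A/B split, and the A/B split), the same as ret above; the two approaches can only differ in order on trees where a node has two non-leaf children.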
The function partition_leaves() from dendextend extracts all leaf labels for each node and returns a list ordered in the same fashion as the get_nodes_xy() output. With your example,
library(dendextend)
Ltrs <- data.frame("A" = c(3,1), "B" = c(1,1), "C" = c(2,4), "D" = c(6,6))
dend <- as.dendrogram(hclust(dist(t(Ltrs))))
plot(dend)
partition_leaves(dend)
yields:
[[1]]
[1] "D" "C" "A" "B"
[[2]]
[1] "D"
[[3]]
[1] "C" "A" "B"
[[4]]
[1] "C"
[[5]]
[1] "A" "B"
[[6]]
[1] "A"
[[7]]
[1] "B"
Filtering the list by vector length (keeping only entries with more than one label) gives output similar to the desired one.
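A short sketch of that filtering step, assuming dendextend is installed and that leaves are the only nodes at height 0 (as in the question's own filter on the coordinates). The variable names internal and xy are just illustrative.

```r
library(dendextend)

Ltrs <- data.frame("A" = c(3,1), "B" = c(1,1), "C" = c(2,4), "D" = c(6,6))
dend <- as.dendrogram(hclust(dist(t(Ltrs))))

# Labels per node; drop the single-label entries, which are the leaves.
leaves   <- partition_leaves(dend)
internal <- leaves[lengths(leaves) > 1]
names(internal) <- seq_along(internal)

# Node coordinates, filtered the same way as in the question.
xy <- get_nodes_xy(dend)
xy <- xy[xy[, 2] != 0, , drop = FALSE]

# Row i of xy should then correspond to internal[[i]].
stopifnot(nrow(xy) == length(internal))
```

Because both objects are filtered from lists in the same node order, the row index of xy can be used directly to look up the labels for that split.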