
R: How to extract all labels in a certain node of a dendrogram

I am writing a program that, as one of its steps, automatically creates dendrograms from an input dataset. For each node/split I want to extract all the labels under that node, along with the location of that node on the dendrogram plot (for further plotting). Say my data looks like this:

> Ltrs <- data.frame("A" = c(3,1), "B" = c(1,1), "C" = c(2,4), "D" = c(6,6))
> dend <- as.dendrogram(hclust(dist(t(Ltrs))))
> plot(dend)

[Figure: the resulting dendrogram]

Now I can extract the location of the splits/nodes:

> library(dendextend)
> nodes <- get_nodes_xy(dend)
> nodes <- nodes[nodes[,2] != 0, ]
> nodes
      [,1]     [,2]
[1,] 1.875 7.071068
[2,] 2.750 3.162278
[3,] 3.500 2.000000

Now I want to get all the labels under each node (that is, for each row of the 'nodes' variable).

The result should look something like this:

$`1`
[1] "D" "C" "B" "A"

$`2`
[1] "C" "B" "A"

$`3`
[1] "B" "A"

Can anybody help me out? Thanks in advance :)

How about something like this?

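library(dendextend)
Ltrs <- data.frame("A" = c(3,1), "B" = c(1,1), "C" = c(2,4), "D" = c(6,6))
dend <- as.dendrogram(hclust(dist(t(Ltrs))))

# Filled as a side effect: one entry per internal node, holding its leaf labels.
accumulator <- list()
myleaves <- function(anode) {
    # A leaf is not a list: just return its label.
    if (!is.list(anode)) return(attr(anode, "label"))
    # Internal node: collect the labels of all children (base R unlist() replaces
    # purrr::reduce(), so tidyverse is not needed), record them, and pass them up.
    accumulator[[length(accumulator) + 1]] <<- unlist(lapply(anode, myleaves))
}

myleaves(dend)
ret <- rev(accumulator)  # children are recorded before their parent, so the root was found last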

Better test this; I am not very trustworthy. In particular, I really hope the list ret is in an order that makes sense, otherwise it's going to be a pain associating the entries with the correct nodes! Good luck.
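On the ordering worry: reversing a children-first list is not guaranteed to give the parent-first, left-to-right order in general. A minimal base R sketch that records each internal node before descending into its children is below. The helper name collect_preorder is made up for illustration, and whether its order lines up with the rows of get_nodes_xy() should be checked against your own trees.

```r
Ltrs <- data.frame("A" = c(3,1), "B" = c(1,1), "C" = c(2,4), "D" = c(6,6))
dend <- as.dendrogram(hclust(dist(t(Ltrs))))

# Hypothetical helper: returns a list with one character vector per internal
# node, parent first, then the nodes of each child branch from left to right.
collect_preorder <- function(node) {
    if (is.leaf(node)) return(list())          # leaves contribute no entry
    children <- lapply(seq_along(node),
                       function(i) collect_preorder(node[[i]]))
    c(list(labels(node)), do.call(c, children))
}

res <- collect_preorder(dend)
names(res) <- seq_along(res)
res
```

For the example data this gives three entries, one per split, starting with the root.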

The function partition_leaves() from dendextend extracts the leaf labels under each node and returns a list ordered in the same way as the get_nodes_xy() output. With your example,

library(dendextend)
Ltrs <- data.frame("A" = c(3,1), "B" = c(1,1), "C" = c(2,4), "D" = c(6,6))
dend <- as.dendrogram(hclust(dist(t(Ltrs))))
plot(dend)

partition_leaves(dend)

yields:

[[1]]
[1] "D" "C" "A" "B"

[[2]]
[1] "D"

[[3]]
[1] "C" "A" "B"

[[4]]
[1] "C"

[[5]]
[1] "A" "B"

[[6]]
[1] "A"

[[7]]
[1] "B"

Filtering this list down to the entries with more than one label (the internal nodes) gives output similar to the desired one.
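A rough sketch of that filtering step, assuming that "internal node" simply means an entry with more than one label, and relying on the matching order of partition_leaves() and get_nodes_xy() described above. The variable names are only for illustration.

```r
library(dendextend)

Ltrs <- data.frame("A" = c(3,1), "B" = c(1,1), "C" = c(2,4), "D" = c(6,6))
dend <- as.dendrogram(hclust(dist(t(Ltrs))))

parts <- partition_leaves(dend)  # one entry per node, leaves included
xy    <- get_nodes_xy(dend)      # one row per node, in the same order

# Assumption: an internal node is any entry holding more than one label.
is_internal <- lengths(parts) > 1

node_labels <- parts[is_internal]               # labels under each split
node_xy     <- xy[is_internal, , drop = FALSE]  # plot position of each split
names(node_labels) <- seq_along(node_labels)
```

Using the same logical index for both objects keeps each label set paired with the coordinates of its node, which is what the question needs for further plotting.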
