Horizontal Dendrogram with Labels in R
I am running into an issue where I can plot a vertical dendrogram with labels, but I can't add labels when it is horizontal.
My data looks like this:
Company Industry1 Industry2 Industry3
Google 3% 5% 6%
Apple 2% 6% 1%
When I import the data, the first column contains my labels, but the row names are just 1, 2, 3, etc.
My data source is called Cluster_D, so my code reads:
labs = Cluster_D[, 1]
Industry <- Cluster_D
rownames(Industry) <- labs$`Company`
D.Industry <- dist(scale(round(Industry[, -1], 3)), method = "euclidean")
H.Industry <- hclust(D.Industry, method = "ward.D")
plot(H.Industry, labels = Cluster_D$`Company`)
So I assign my labels to the variable "labs". I then place my data into another variable, "Industry". Once I plot the data and pass in the labels, I get the chart with the clusters I need. The chart works vertically with labels, but
I have no idea how to flip this chart to horizontal and keep the label names. I tried the as.dendrogram function, which lets me use horiz = TRUE, but I can't keep my labels: they revert to 1, 2, 3, etc.
Can anyone explain how I can correct this? I am used to Statistica, where I had no issues doing hierarchical clustering, and I am trying to pick up R. I feel like it should be super easy to assign labels, but I just don't know how.
I tried using the code below, but the chart is mislabeled (the labels appear in alphabetical order):
F.Industries <- as.dendrogram(H.Industry)
labels(F.Industries) <- paste(as.character(Cluster_D[,1]))
plot(F.Industries, horiz = TRUE)
As requested by PAR:
Data (I added one more row, IBM):
z <- read.table(text = "Company Industry1 Industry2 Industry3
Google 3% 5% 6%
Apple 2% 6% 1%
IBM 7% 4% 2%", header = T)
When I try:
scale(round(z[, -1], 3))
#output
Error in Math.data.frame(list(Industry1 = c(2L, 1L, 3L), Industry2 = c(2L, :
non-numeric variable in data frame: Industry1Industry2Industry3
This means the sample data you provided is not representative of your own: here the percent signs stop the columns from being read as numbers.
Convert to numeric:
z = data.frame("Company" = z[,1], apply(z[,-1], 2, function(x) as.numeric(gsub("%", "", x))))
Row names are used as the labels for the leaves:
rownames(z) <- z[,1]
D.Industry <- dist(scale(z[, -1]), method = "euclidean")
H.Industry <- hclust(D.Industry, method = "ward.D")
plot(as.dendrogram(H.Industry), horiz = T)
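On the mislabeled attempt in the question: labels(), when called on a dendrogram, returns the leaves in plotting order, while Cluster_D[, 1] is in row order, so assigning one straight onto the other shuffles the names. (The replacement form labels(x) <- ... is also not part of base R as far as I know; it most likely came from an add-on package such as dendextend.) If setting row names is not convenient, a minimal base-R sketch is to put the names on the hclust object before converting it. The data frame below just re-creates the toy data in numeric form, and the names hc and dend are only illustrative.

```r
# Toy data from above, already numeric; row names deliberately left as 1, 2, 3
z <- data.frame(
  Company   = c("Google", "Apple", "IBM"),
  Industry1 = c(3, 2, 7),
  Industry2 = c(5, 6, 4),
  Industry3 = c(6, 1, 2)
)

hc <- hclust(dist(scale(z[, -1]), method = "euclidean"), method = "ward.D")

# hc$labels is indexed by original row number, so the company names can be
# assigned in row order; as.dendrogram() then carries them to the right leaves
hc$labels <- as.character(z$Company)
dend <- as.dendrogram(hc)

# Widen the right margin so horizontal labels are not clipped
par(mar = c(2, 0, 0, 8))
plot(dend, horiz = TRUE)
```

order.dendrogram(dend) gives the row index of each leaf in plotting order, which is a quick way to check that every name ended up on its own row's leaf.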
One can adjust the margins with mar:
par(mar=c(2, 0, 0, 8))
plot(as.dendrogram(H.Industry), horiz = T)
Other approaches include using the ape and ggdendro packages.
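For reference, here is a rough sketch of what those two alternatives might look like. It assumes both packages are installed (for example via install.packages(c("ape", "ggdendro"))) and re-creates the toy data with the company names as row names; the plotting arguments are only illustrative choices.

```r
# Toy data with company names as row names, so hclust picks them up as labels
z <- data.frame(
  Industry1 = c(3, 2, 7),
  Industry2 = c(5, 6, 4),
  Industry3 = c(6, 1, 2),
  row.names = c("Google", "Apple", "IBM")
)
hc <- hclust(dist(scale(z), method = "euclidean"), method = "ward.D")

# ape: convert to a "phylo" tree; its plot method draws left to right by default
if (requireNamespace("ape", quietly = TRUE)) {
  plot(ape::as.phylo(hc), cex = 0.9, label.offset = 0.05)
}

# ggdendro: rotate = TRUE flips the ggplot2-based dendrogram to horizontal
if (requireNamespace("ggdendro", quietly = TRUE)) {
  print(ggdendro::ggdendrogram(hc, rotate = TRUE))
}
```

Both routes take the labels from the hclust object, so they rely on the row names (or hc$labels) being set beforehand.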