Horizontal dendrogram with labels in R

Question

I can plot a vertical dendrogram with labels, but I can't get the labels to show when the dendrogram is horizontal.

My data looks like this:

Company Industry1 Industry2 Industry3
Google     3%        5%        6%
Apple      2%        6%        1%

When I import the data, the first column contains my labels, but the row names are just 1, 2, 3, etc.

The data source is called Cluster_D, so my code reads:

labs = Cluster_D[, 1]
Industry <- Cluster_D
rownames(Industry) <- labs$`Company`


D.Industry <- dist(scale(round(Industry[, -1], 3)), method = "euclidean")
H.Industry <- hclust(D.Industry, method = "ward.D")
plot(H.Industry, labels = Cluster_D$`Company`)

So I assign my labels to the variable "labs", then place my data into another variable, "Industry". Once I plot the data and pass in the labels, I get the chart with the clusters I need. The chart works vertically with labels, but I have no idea how to flip it to horizontal and keep the label names.

I tried the as.dendrogram function, which lets me use horiz = TRUE, but I can't keep my labels: they revert to 1, 2, 3, etc.

Can anyone explain how I can fix this? I am used to Statistica, where I had no issues doing hierarchical clustering, and I am trying to pick up R. I feel like it should be super easy to assign labels, but I just don't know how.

I tried the code below, but the chart is mislabeled (ABC order).

F.Industries <- as.dendrogram(H.Industry)
labels(F.Industries) <- paste(as.character(Cluster_D[,1]))
plot(F.Industries, horiz = TRUE)

Answer 1

As requested by PAR:

Data (I added one more row, IBM):

z <- read.table(text = "Company Industry1 Industry2 Industry3
Google     3%        5%        6%
Apple      2%        6%        1%
IBM        7%        4%        2%", header = T)

When I try:

scale(round(z[, -1], 3))
#output
Error in Math.data.frame(list(Industry1 = c(2L, 1L, 3L), Industry2 = c(2L,  : 
  non-numeric variable in data frame: Industry1Industry2Industry3

This means the sample data you provided is not representative of your own: here the percentage columns are read in as text, not numbers.

Convert to numeric:

z = data.frame("Company" = z[,1], apply(z[,-1], 2, function(x) as.numeric(gsub("%", "", x))))

Row names are used as the labels for the leaves, so set them from the first column:

rownames(z) <- z[,1]

D.Industry <- dist(scale(z[, -1]), method = "euclidean")
H.Industry <- hclust(D.Industry, method = "ward.D")

plot(as.dendrogram(H.Industry), horiz = T)
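If the row names do not survive (for example, when the data is imported as a tibble, which I am only guessing may be the case here), a possible alternative is to set the labels on the hclust object itself before converting it. This is a minimal sketch using base R only; the names company, x, hc and dend, and the numeric values, are made up for illustration and mirror the sample data above.

```r
# Hypothetical stand-in for the real data: a numeric matrix WITHOUT row names
company <- c("Google", "Apple", "IBM")
x <- matrix(c(3, 2, 7,    # Industry1
              5, 6, 4,    # Industry2
              6, 1, 2),   # Industry3
            ncol = 3)

hc <- hclust(dist(scale(x)), method = "ward.D")

# hc$labels is NULL because x has no row names; set it in the original row order
hc$labels <- company

# as.dendrogram() carries hc$labels over to the leaves, matched by observation
dend <- as.dendrogram(hc)
plot(dend, horiz = TRUE)
```

Because the labels are attached in row order before the tree is reordered for plotting, each leaf keeps the name of its own observation. Assigning a label vector in row order to an already built dendrogram can put names on the wrong leaves, since leaves are listed in plotting order.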

One can adjust the margins with mar, to leave room for the labels on the right:

par(mar=c(2, 0, 0, 8))
plot(as.dendrogram(H.Industry), horiz = T)
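The mar values are given in the order bottom, left, top, right, in lines of text. As a rough sketch, the right margin could be derived from the longest label instead of being hard-coded, and the previous settings restored afterwards. The 0.6 factor is only a guess at the width of a character in margin lines, and company, x, hc and right_margin are illustrative names, not part of the original code.

```r
# Hypothetical stand-in for the real data, with company names as row names
company <- c("Google", "Apple", "IBM")
x <- matrix(c(3, 2, 7, 5, 6, 4, 6, 1, 2), ncol = 3,
            dimnames = list(company, NULL))
hc <- hclust(dist(scale(x)), method = "ward.D")

# Heuristic (assumption): about 0.6 margin lines per character, plus padding
right_margin <- max(nchar(company)) * 0.6 + 2

# par() returns the old settings, so they can be restored after plotting
op <- par(mar = c(2, 0, 0, right_margin))
plot(as.dendrogram(hc), horiz = TRUE)
par(op)
```

If long labels are still clipped, increasing the factor or the padding is usually enough.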

Other approaches include using the ape and ggdendro packages.
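A brief sketch of what those two alternatives might look like, assuming the packages are installed (each call is skipped otherwise). The data and the names company, x, hc and p are again made up to mirror the sample above.

```r
# Hypothetical stand-in for the real data, with company names as row names
company <- c("Google", "Apple", "IBM")
x <- matrix(c(3, 2, 7, 5, 6, 4, 6, 1, 2), ncol = 3,
            dimnames = list(company, NULL))
hc <- hclust(dist(scale(x)), method = "ward.D")

# ggdendro: rotate = TRUE draws the tree horizontally with leaf labels
if (requireNamespace("ggdendro", quietly = TRUE)) {
  p <- ggdendro::ggdendrogram(hc, rotate = TRUE)
  print(p)
}

# ape: convert to a phylo object; its plot method draws a rightwards tree by default
if (requireNamespace("ape", quietly = TRUE)) {
  plot(ape::as.phylo(hc))
}
```

Both take the labels from the hclust object, so the row names (or hc$labels) need to be set before this step.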

Answer 1, 2017-11-08 23:56:32