How to show the cluster number of each node on a SOM plot?

I want to find out which node each wine is mapped to after plotting a SOM.

To do that, we first need a data.frame with the ID (the "name") of each wine and the number of the cluster that wine belongs to. The next step would be to show the cluster numbers on the plot. But idk how :)

library(kohonen)

data(wines)
View(wines)
# add an ID for each wine

wines <- as.data.frame(wines)
wines$ID <- seq.int(nrow(wines))

# drop the ID column (column 14) so it is not used for training

som_wines <- wines[, -14]
som_model <- som(scale(som_wines), grid = somgrid(5, 5, "hexagonal"))
som_codes <- as.data.frame(som_model$codes[[1]])

# estimate how many clusters are needed (elbow plot of the within-cluster sum of squares)

mydata <- as.data.frame(som_model$codes[[1]])
wss <- (nrow(mydata)-1)*sum(apply(mydata,2,var)) 
for (i in 2:15) {
  wss[i] <- sum(kmeans(mydata, centers=i)$withinss)
}
plot(wss)

# SOM plot

som_cluster <- cutree(hclust(dist(som_codes)), 3)
plot(som_model, type = "codes", bgcol = som_cluster, main = "Clusters")
add.cluster.boundaries(som_model, som_cluster)

# Here we get 3 clusters. Create the data frame that maps each wine ID to its cluster.

cluster_details <- data.frame(id=wines$ID, cluster=som_cluster[som_model$unit.classif])

And now I want the cluster numbers to be shown on the SOM plot. Any suggestions on how to do that? Would appreciate any answer :)

You can check which node each observation belongs to through the model's unit.classif element. In your script the model is assigned to som_model, so you can call

som_model$unit.classif

The vector follows the row order of your data, i.e. your 1st input row belongs to the node given by the 1st value of unit.classif, and so on. You can check this by calling

length(som_model$unit.classif)
nrow(som_wines)

Both return the same number. The library stores the nodes (the codebook vectors) in a matrix of dimension (nodes x features). If you define your model with 5x5 nodes and your data have 13 features, the nodes are stored as a 25x13 matrix. You can check by calling

dim(som_model$codes[[1]])

On the map, the nodes are numbered row by row, from bottom left to top right. The first node is at the bottom left and the 25th node is at the top right of the codes map. So if you want to know the position of the node that a particular observation belongs to, you can extend your script to something like this:

from.bottom <- ceiling(som_model$unit.classif / som_model$grid$xdim)
from.left <- som_model$unit.classif %% som_model$grid$xdim
from.left[from.left == 0] <-  som_model$grid$xdim

cluster_details <- cbind(
  cluster_details, som.unit = som_model$unit.classif,
  from.bottom = from.bottom, from.left = from.left
)

(cluster_details)
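
The index arithmetic above can be checked on a few hand-picked unit numbers without training a model. This is only a small sketch: unit_position is a hypothetical helper (it is not part of kohonen), and it assumes units are numbered left to right starting from the bottom row, as described above.

```r
# Hypothetical helper (not part of kohonen): convert unit indices to grid
# positions, assuming units are numbered left to right from the bottom row.
unit_position <- function(unit, xdim) {
  data.frame(
    unit        = unit,
    from.bottom = ceiling(unit / xdim),      # row, counted from the bottom
    from.left   = (unit - 1) %% xdim + 1     # column, counted from the left
  )
}

# A few units of a 5x5 grid: corners, the start of row 2, and the centre
pos <- unit_position(c(1, 5, 6, 13, 25), xdim = 5)
print(pos)
```

The expression (unit - 1) %% xdim + 1 gives the same column as the %% line followed by the replacement of zeros, just in one step.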

The answer is here: add clusters and nodes from SOMbrero package to training data

Particularly these lines:

SomModel <- som(
    data = TrainingMatrix,
    grid = GridDefinition,
    rlen = 10000,
    alpha = c(0.05, 0.01),
    keep.data = TRUE
)

nb <- table(SomModel$unit.classif)
groups <- 5
tree.hc <- cutree(hclust(d = dist(SomModel$codes[[1]]), method = "ward.D2", members = nb), groups)

result <- OrginalData
result$Cluster <- tree.hc[SomModel$unit.classif]
result$X <- SomModel$grid$pts[SomModel$unit.classif,"x"]
result$Y <- SomModel$grid$pts[SomModel$unit.classif,"y"]
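
To put the cluster numbers on the plot itself, one option is to draw text at the node coordinates stored in grid$pts, the same coordinates used for result$X and result$Y above. Below is a minimal sketch on the wines data from the question. It assumes kohonen version 3 or later (where codes is a list) and assumes that the plot is drawn in the coordinates of grid$pts; the seed, the colours and the name wine_clusters are arbitrary choices for illustration.

```r
library(kohonen)

data(wines)      # numeric matrix of wine measurements shipped with kohonen
set.seed(42)     # SOM training is random; fix the seed for repeatability

som_model <- som(scale(wines), grid = somgrid(5, 5, "hexagonal"))

# One cluster label per node, from hierarchical clustering of the codebook
som_cluster <- cutree(hclust(dist(som_model$codes[[1]])), 3)

# Background colour per node, chosen by cluster
cluster_cols <- c("lightblue", "lightgreen", "khaki")
plot(som_model, type = "codes", bgcol = cluster_cols[som_cluster],
     main = "Clusters")
add.cluster.boundaries(som_model, som_cluster)

# Assumption: the plot uses the coordinates in grid$pts, so text() placed
# at those points lands on the matching node.
text(som_model$grid$pts[, "x"], som_model$grid$pts[, "y"],
     labels = som_cluster, font = 2)

# Per-wine table: node and cluster of every observation
wine_clusters <- data.frame(
  id      = seq_len(nrow(wines)),
  unit    = som_model$unit.classif,
  cluster = som_cluster[som_model$unit.classif]
)
head(wine_clusters)
```

Replacing labels = som_cluster with labels = seq_along(som_cluster) would print the node numbers instead, which makes it easy to match a wine's unit value to a position on the map.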
