Dictionary of clusters and partitions with KMeans in Python

Question

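I am looking for a solution to my problem.

I am using sklearn's KMeans and I want a dictionary of the form {cluster : list of partition}, mapping each cluster to the list of points assigned to it.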

kmeans = KMeans(n_clusters=n)
kmeans.fit(data)

result = zip(data,kmeans.labels_)
sortedR = sorted(result,key=lambda x: x[1])

cluster_nb = {}
for k,v in sortedR:
    if v in cluster_nb:
        cluster_nb[v].append(k)
    else:
        cluster_nb[v] = [k]

This uses the cluster index from kmeans.labels_ as the key, but I need the corresponding element of kmeans.cluster_centers_ as the key instead.

For example:

{'[1,2]' :  [array([1, 3]), array([2,4])], '[5,5]' : [array([7, 8]), array([10,12])]}

I tried adding a second loop:

for x in cluster_nb:
    cluster_nb[str(kmeans.cluster_centers_[x])] = cluster_nb.pop(x)
return cluster_nb

But I get this error:

IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices

Where did I go wrong?

Is there a simpler solution?

Answer 1

Try this:

from sklearn.cluster import KMeans
import numpy as np

data = np.random.randint(100, size=(100, 2))
kmeans = KMeans(n_clusters=5)
kmeans.fit(data)

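# Map each cluster label to the list of points assigned to it
centroids_partitions = {}
for centr in kmeans.cluster_centers_:
    # predict() returns a one-element array holding this centroid's label
    centroid_label = kmeans.predict([centr])
    partition = []
    for k, v in zip(data, kmeans.labels_):
        if v == centroid_label:
            partition.append(k.ravel())

    centroids_partitions[centroid_label[0]] = partition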

print(centroids_partitions)

This returns a dictionary like this:

{0: [array([55,  8]), ... ,[truncated], 1: [array([70, 87]), array([77, 63]), ... ]}

where 0, 1, etc. are the cluster labels from kmeans.labels_.
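
The nested loop above scans the whole dataset once per centroid. A single pass should give the same label-keyed dictionary. Here is a minimal sketch, assuming `labels` stands in for `kmeans.labels_` and `data` for the fitted array; the function name `group_by_label` and the small hard-coded inputs are illustrative only and not part of the original answer.

```python
from collections import defaultdict


def group_by_label(data, labels):
    """Group each data point under the label assigned to it."""
    groups = defaultdict(list)
    for point, label in zip(data, labels):
        # int() turns NumPy integer labels into plain Python ints
        groups[int(label)].append(point)
    return dict(groups)


# Hypothetical stand-ins for `data` and `kmeans.labels_`
data = [[1, 3], [7, 8], [2, 4], [10, 12]]
labels = [0, 1, 0, 1]

partitions = group_by_label(data, labels)
print(partitions)
```

With a fitted model, the call would be `group_by_label(data, kmeans.labels_)`.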

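Alternatively, if you want the centroid coordinates as the dictionary keys, replace the assignment `centroids_partitions[centroid_label[0]] = partition` with: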

centroids_partitions[centr[0],centr[1]] = partition

Output:

{(68.29411764705881, 24.470588235294127): [array([72, 19]), array([69,  1]), array([58, 46]), .... ]}
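
As for the IndexError in the question: a likely cause is that the loop adds string keys to `cluster_nb` while iterating over it. Each pop followed by an insert leaves the dictionary size unchanged, so Python does not complain about the size changing, and the iteration can reach one of the newly added string keys, which is then used to index `kmeans.cluster_centers_`. Building a new dictionary avoids this. A minimal sketch, where `rekey_by_centroid` is a hypothetical helper and `centers` stands in for `kmeans.cluster_centers_`:

```python
def rekey_by_centroid(partitions, centers):
    """Return a new dict keyed by centroid coordinates instead of label."""
    # A tuple is hashable; a list or NumPy array cannot be a dict key.
    # float() converts NumPy scalars to plain floats.
    return {
        tuple(float(c) for c in centers[label]): points
        for label, points in partitions.items()
    }


# Hypothetical stand-ins for the label-keyed dict and `kmeans.cluster_centers_`
partitions = {0: [[1, 3], [2, 4]], 1: [[7, 8], [10, 12]]}
centers = [[1.5, 3.5], [8.5, 10.0]]

by_centroid = rekey_by_centroid(partitions, centers)
print(by_centroid)
```

Tuple keys also keep the coordinates usable as numbers, whereas `str(...)` keys as in the question would have to be parsed back.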

Solution 1
0 2020-03-26 12:19:36
