Dict of cluster and partition with kmeans python
I'm looking for a solution to my problem.
I'm using sklearn's KMeans and I'd like a dictionary of the form {cluster: list of partition}.
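from sklearn.cluster import KMeans

kmeans = KMeans(n_clusters=n)
kmeans.fit(data)
# Pair every point with its cluster label and sort by label
result = zip(data, kmeans.labels_)
sortedR = sorted(result, key=lambda x: x[1])
# Group the points by label
cluster_nb = {}
for k, v in sortedR:
    if v in cluster_nb:
        cluster_nb[v].append(k)
    else:
        cluster_nb[v] = [k]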
This gives me the cluster index from kmeans.labels_ as the key, but I need the corresponding element of kmeans.cluster_centers_ instead.
For example:
{'[1,2]' : [array([1, 3]), array([2,4])], '[5,5]' : [array([7, 8]), array([10,12])]}
I tried a second loop:
for x in cluster_nb:
    cluster_nb[str(kmeans.cluster_centers_[x])] = cluster_nb.pop(x)
return cluster_nb
But I get this error:
IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
Where did I go wrong?
Is there a simpler solution?
Try this:
from sklearn.cluster import KMeans
import numpy as np

data = np.random.randint(100, size=(100, 2))
kmeans = KMeans(n_clusters=5)
kmeans.fit(data)

centroids_partitions = {}
for centr in kmeans.cluster_centers_:
    # Label of the cluster this centroid belongs to (an array of length 1)
    centroid_label = kmeans.predict([centr])
    # Collect every point assigned to that label
    partition = []
    for k, v in zip(data, kmeans.labels_):
        if v == centroid_label:
            partition.append(k.ravel())
    centroids_partitions[centroid_label[0]] = partition
print(centroids_partitions)
This returns a dictionary like this:
{0: [array([55, 8]), ... ,[truncated], 1: [array([70, 87]), array([77, 63]), ... ]}
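where 0, 1, etc. are the cluster labels from kmeans.labels_.
Alternatively, if you want the centroid coordinates as the dictionary keys, replace the assignment to centroids_partitions with: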
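A shorter way to get the same label-keyed dictionary is to select each partition with a boolean mask on kmeans.labels_. This is only a sketch: the variable names rng and partitions, the seeds, and the explicit n_init=10 are assumptions made here for reproducibility, and each value is a 2-D array rather than a list of arrays.

```python
import numpy as np
from sklearn.cluster import KMeans

# Illustrative random data; the seed is arbitrary.
rng = np.random.default_rng(0)
data = rng.integers(0, 100, size=(100, 2))

kmeans = KMeans(n_clusters=5, n_init=10, random_state=0).fit(data)

# labels_[i] is the cluster index of data[i], so comparing labels_ with a
# label gives a boolean mask that selects that cluster's points.
partitions = {
    label: data[kmeans.labels_ == label]
    for label in range(kmeans.n_clusters)
}

for label, points in partitions.items():
    print(label, len(points))
```

This avoids calling predict on each centroid, since the row index of kmeans.cluster_centers_ already is the cluster label.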
centroids_partitions[centr[0],centr[1]] = partition
Output:
{(68.29411764705881, 24.470588235294127): [array([72, 19]), array([69, 1]), array([58, 46]), .... ]}
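As for the error in the question, a likely cause is that the loop pops and inserts keys in cluster_nb while iterating over it, so the newly inserted string keys can be visited later and used as an index into kmeans.cluster_centers_ (on newer Python versions the same loop may raise a RuntimeError instead). Building a new dictionary avoids this. The sketch below uses small made-up arrays as stand-ins for data, kmeans.labels_ and kmeans.cluster_centers_; the names labels, centers and by_center are illustrative only.

```python
import numpy as np

# Made-up stand-ins for data, kmeans.labels_ and kmeans.cluster_centers_.
data = np.array([[1, 3], [2, 4], [7, 8], [10, 12]])
labels = np.array([0, 0, 1, 1])
centers = np.array([[1.5, 3.5], [8.5, 10.0]])

# Group the points by integer label, as in the question.
cluster_nb = {}
for point, label in zip(data, labels):
    cluster_nb.setdefault(int(label), []).append(point)

# Build a new dict keyed by centroid instead of mutating cluster_nb
# during iteration. Tuples are hashable, so no str() conversion is needed.
by_center = {
    tuple(centers[label].tolist()): points
    for label, points in cluster_nb.items()
}

for center, points in by_center.items():
    print(center, [p.tolist() for p in points])
```

If the original dictionary has to be modified in place, iterating over a snapshot of the keys, for example list(cluster_nb), should have the same effect.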