How to visually compare clusters using Python?
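I am working on k-means clustering for customer segmentation. My input data has 12 features and 7315 rows.
So I tried the code below to run k-means:
from sklearn.cluster import KMeans
kmeans = KMeans(n_clusters = 5, init = "k-means++", random_state = 42)
data_normalized['y_kmeans'] = kmeans.fit_predict(data_normalized)
To visualize the result, I tried the following code: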
u_labels = np.unique(data_normalized['y_kmeans'])
#plotting the results:
for i in u_labels:
    plt.scatter(data_normalized[y_kmeans == i , 0] , data_normalized[y_kmeans == i , 1] , label = i)
plt.legend()
plt.show()
I get the following error:
TypeError: '(array([False, False, False, ..., False, False, False]), 0)' is an invalid key
InvalidIndexError: (array([False, False, False, ..., False, False, False]), 0)
How can I visualize my clusters to see how far apart they are?
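Since I don't have your dataset, I simulated your DataFrame as follows (I assumed 9 different cluster groups):
import random
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# Simulated stand-in: two features in [0.01, 0.99] and a random cluster label from 1 to 9
d = {'col1': [i/100 for i in random.choices(range(1, 100), k=7315)],
     'col2': [i/100 for i in random.choices(range(1, 100), k=7315)],
     'y_kmeans': random.choices(range(1, 10), k=7315)}
data_normalized = pd.DataFrame(d)
After that, you can plot the clusters as follows:
u_labels = np.unique(data_normalized['y_kmeans']).tolist()
# Draw all points in a single call, colored by cluster label
scatter = plt.scatter(data_normalized['col1'], data_normalized['col2'],
                      c=data_normalized['y_kmeans'], cmap='tab20')
# legend_elements() returns (handles, labels); reuse the handles with our own labels
plt.legend(handles=scatter.legend_elements()[0], labels=u_labels)
plt.show()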
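As for the error itself: data_normalized[y_kmeans == i, 0] is NumPy-style indexing, but data_normalized is a pandas DataFrame, so the (mask, 0) tuple is treated as a single column key and rejected as invalid. If you prefer to keep the per-cluster loop, selecting rows with .loc and a boolean mask should work. The sketch below is only an illustration under assumptions: the column names col1 and col2 come from the simulated frame above, and the sample size, the seed and the 5 labels are made up. It also prints the pairwise Euclidean distances between cluster centroids, which is one possible way to put a number on how far apart the clusters are.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere; drop it to open a window
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real frame: two features plus a cluster label 0..4
rng = np.random.default_rng(42)
data_normalized = pd.DataFrame({
    "col1": rng.random(300),
    "col2": rng.random(300),
    "y_kmeans": rng.integers(0, 5, size=300),
})

fig, ax = plt.subplots()
for label in np.unique(data_normalized["y_kmeans"]):
    # Boolean mask selecting the rows that belong to this cluster
    mask = data_normalized["y_kmeans"] == label
    # .loc takes (row selector, column name); plain [] on a DataFrame does not
    ax.scatter(data_normalized.loc[mask, "col1"],
               data_normalized.loc[mask, "col2"],
               label=f"cluster {label}")
ax.legend()

# Centroid of each cluster: the mean of every feature per label
centroids = data_normalized.groupby("y_kmeans")[["col1", "col2"]].mean()

# Pairwise Euclidean distances between centroids, computed via broadcasting
points = centroids.to_numpy()
diff = points[:, None, :] - points[None, :, :]
distances = pd.DataFrame(np.linalg.norm(diff, axis=-1),
                         index=centroids.index, columns=centroids.index)
print(distances.round(3))
```

With the real data, the two plotted columns would have to be replaced by two of the 12 features (or by a 2-D projection of them), and the centroid distances could be computed over all feature columns rather than just two. In an interactive session, call plt.show() after ax.legend() to display the figure.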