Drawing boundary lines based on kmeans cluster centres

I'm quite new to scikit-learn, but wanted to try an interesting project.

I have longitudes and latitudes for points in the UK, which I used to create cluster centers using scikit-learn's KMeans class. To visualise this data, rather than having the points as clusters, I wanted to instead draw boundaries around each cluster. For example, if one cluster was London and the other Oxford, I currently have a point at the center of each city, but I was wondering if there's a way to use this data to create a boundary line based on my clusters instead?

Here is my code so far to create the clusters:

import pandas as pd
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

location1="XXX"
df = pd.read_csv(location1, encoding = "ISO-8859-1")

#Run kmeans clustering
X = df[['long','lat']].values #~2k locations in the UK
y=df['label'].values   #Label is a 0 or 1
kmeans = KMeans(n_clusters=30, random_state=0).fit(X, y)
centers=kmeans.cluster_centers_
plt.scatter(centers[:,0],centers[:,1], marker='s', s=100)

So I would like to be able to convert the centers in the above example to lines that demarcate each of the regions. Is this possible?

Thanks,

Anant

I guess you're talking about spatial boundaries. In that case, you should follow Bunyk's recommendation and use a Voronoi diagram [1]. Here is a practical demonstration of what you could achieve: http://nbviewer.jupyter.org/gist/pv/8037100
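
To see why a Voronoi diagram fits here: k-means assigns each point to its nearest centre, and a Voronoi cell is the set of locations closer to one centre than to any other, so the cell edges are exactly the boundaries between clusters. Below is a minimal sketch of that nearest-centre rule. The three centres and the helper name `nearest_center` are made up for illustration (they are not the asker's data), and longitude/latitude are treated as plain planar coordinates, just as KMeans does.

```python
import numpy as np

# Hypothetical cluster centres as (long, lat) pairs,
# standing in for kmeans.cluster_centers_
centers = np.array([[-0.13, 51.51],   # roughly London
                    [-1.26, 51.75],   # roughly Oxford
                    [-2.24, 53.48]])  # roughly Manchester


def nearest_center(points, centers):
    """Return, for each point, the index of the closest centre (Euclidean)."""
    # Pairwise squared distances, shape (n_points, n_centers)
    d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


# Three query locations, one near each hypothetical centre
query = np.array([[-0.10, 51.50], [-1.20, 51.70], [-2.00, 53.40]])
labels = nearest_center(query, centers)
```

Every location that gets the same index from `nearest_center` lies in the same Voronoi cell, so drawing the cell edges draws the cluster boundaries.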

You can use SciPy to generate a Voronoi diagram; see the docs for scipy.spatial.Voronoi.

For your code it would be:

from scipy.spatial import Voronoi, voronoi_plot_2d
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

location1="XXX"
df = pd.read_csv(location1, encoding = "ISO-8859-1")

# Run k-means clustering
X = df[['long', 'lat']].values  # ~2k locations in the UK
y = df['label'].values  # Label is a 0 or 1 (KMeans is unsupervised and ignores y)
kmeans = KMeans(n_clusters=30, random_state=0).fit(X, y)
centers = kmeans.cluster_centers_

plt.scatter(centers[:, 0], centers[:, 1], marker='s', s=100)

# Build the Voronoi diagram of the cluster centres and draw it on the current axes
vor = Voronoi(centers)
fig = voronoi_plot_2d(vor, ax=plt.gca())

plt.show()
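
If you need the boundary lines as data rather than only as a plot (for example, to overlay them on a map), one option is to read the finite edges out of the `Voronoi` object. This is a rough sketch, not a complete solution: the helper name `finite_boundary_segments` and the five demo centres are made up for illustration, and edges that extend to infinity are simply skipped.

```python
import numpy as np
from scipy.spatial import Voronoi


def finite_boundary_segments(centers):
    """Return (start, end) coordinate pairs for the finite Voronoi ridges."""
    vor = Voronoi(centers)
    segments = []
    for v0, v1 in vor.ridge_vertices:
        # An index of -1 marks a ridge that extends to infinity; skip those
        if v0 >= 0 and v1 >= 0:
            segments.append((vor.vertices[v0], vor.vertices[v1]))
    return segments


# Made-up centres: four outer points around one inner point
centers_demo = np.array([[0.0, 0.0], [2.0, 0.3], [1.0, 1.8],
                         [-1.0, 1.1], [0.7, 0.8]])
segments = finite_boundary_segments(centers_demo)
```

Each entry in `segments` is a pair of 2D points, so it can be drawn with something like `plt.plot([a[0], b[0]], [a[1], b[1]])`. For real use you would also need to handle the infinite edges, for example by clipping them to a bounding box around the UK.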
