Scikit-learn KMeans clustering - fit cluster with X features, predict cluster membership with X-1 features?

Question

I am trying to solve a regression task (predicting the value of a 'count' field) using KMeans clustering. The idea is simple:

Fit a clustering on my training dataset:

 from sklearn import cluster

 k_means = cluster.KMeans(n_clusters=4, n_init=20, init='random')
 k_means.fit(df[['DistanceToMidnight','season','DayType','weather','temp','atemp','humidity','windspeed','count']])

Note that I do include 'count' in the clustering.

Then I want to use my test set (which is much the same, except that it has no 'count' field): determine cluster membership using all features EXCEPT 'count', and then set 'count' for each test row to the 'count' coordinate of the assigned cluster center.

Any ideas on how to do this simply with the standard functions of KMeans? I can't just call 'k_means.predict', since it will fail because the number of features does not match.

The simplest way I could think of is to construct a new KMeans object from the cluster centers of the already trained clustering, but I am not sure how to do this. Is it possible to create a new cluster.KMeans object by providing it with already defined cluster centroids?

Answer 1

1. Find the nearest cluster center, measuring distance only on the features the test row actually has.
2. Use the missing value ('count') from that center.

If you stick to the k-means principle, your best prediction is the value stored in the assigned center, unless you, e.g., build a separate regression model for each cluster.
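The two steps above can be sketched with plain NumPy. This is only an illustration under assumptions: the centers and test rows are made-up numbers, the column layout `[temp, humidity, count]` is hypothetical, and in practice the centers would come from `k_means.cluster_centers_` with 'count' as the last column.

```python
import numpy as np

# Hypothetical cluster centers; columns are [temp, humidity, count].
# In a real run these would be k_means.cluster_centers_.
centers = np.array([
    [10.0, 80.0, 50.0],
    [25.0, 40.0, 300.0],
])

# Test rows carry only the first two features (no 'count').
X_test = np.array([
    [12.0, 75.0],
    [24.0, 45.0],
])

# Step 1: nearest center, measured on the shared features only.
partial_centers = centers[:, :-1]
diffs = X_test[:, None, :] - partial_centers[None, :, :]
sq_dist = (diffs ** 2).sum(axis=2)   # shape (n_test, n_clusters)
nearest = sq_dist.argmin(axis=1)

# Step 2: read the missing value off the chosen center.
predicted_count = centers[nearest, -1]
print(predicted_count)
```

Squared distances are enough here, since taking the square root does not change which center is nearest.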

Answer 2

You can first calculate all the centroids using K-Means. Then use euclidean_distances from sklearn.metrics to compute the distance from every point to every centroid, dropping from the centroids the coordinates you want to exclude (here, 'count'). Finally, pick for each point the cluster that minimizes the distance (np.argmin along the second axis, i.e. axis=1).
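A minimal end-to-end sketch of this approach might look as follows. It assumes 'count' is the last column of the training matrix (as in the question's column list) and uses synthetic data as a stand-in for `df`; the group means, sample sizes, and `random_state` are arbitrary choices for illustration, not values from the question.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics.pairwise import euclidean_distances

rng = np.random.RandomState(0)

# Synthetic stand-in for the training data: two features plus 'count' last.
# Two well-separated groups keep the clustering easy to follow.
group_a = rng.normal(loc=[0.0, 0.0, 10.0], scale=0.5, size=(50, 3))
group_b = rng.normal(loc=[10.0, 10.0, 100.0], scale=0.5, size=(50, 3))
X_train = np.vstack([group_a, group_b])

# Fit on all columns, including 'count'.
k_means = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X_train)
centers = k_means.cluster_centers_

# Test rows lack 'count', so compare against the centers without that column.
X_test = np.array([[0.2, -0.1], [9.8, 10.3]])
dist = euclidean_distances(X_test, centers[:, :-1])  # (n_test, n_clusters)
labels = np.argmin(dist, axis=1)

# Fill in 'count' from the assigned center's last coordinate.
predicted_count = centers[labels, -1]
print(labels, predicted_count)
```

This sidesteps `k_means.predict` entirely, so the fitted object never sees an input with the wrong number of features.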

2 answers

solution1
1 2015-01-29 17:13:33

solution2
1 ACCEPTED 2015-01-30 09:16:19
