Reproducible kmeans in sklearn

Question

I am using the document clustering code available here. I know that k-means solves a non-convex problem, so the result of the optimization can differ from run to run. Is there a way to make the clustering reproducible, for example by fixing a random seed?

Answer 1

You can fix the random_state parameter of KMeans, which seeds the random number generator used for centroid initialization. The following code uses 42:

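from sklearn.cluster import KMeans

# true_k and opts.verbose come from the clustering script in the question
km = KMeans(n_clusters=true_k, init='k-means++', max_iter=100, n_init=1,
            verbose=opts.verbose,
            random_state=42)  # fixed seed makes the initialization repeatable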

You can check the documentation here.
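
To check that the seed really makes the run repeatable, a minimal sketch like the one below can help. It does not use the document vectors from the question: the data comes from make_blobs, and the helper name fit_labels and the choice of 4 clusters are only illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic stand-in for the document vectors (an assumption for this sketch).
X, _ = make_blobs(n_samples=300, centers=4, random_state=0)


def fit_labels(seed):
    """Fit k-means once with the given seed and return the cluster labels."""
    km = KMeans(n_clusters=4, init="k-means++", max_iter=100,
                n_init=1, random_state=seed)
    return km.fit_predict(X)


# Two independent fits with the same seed on the same data.
labels_a = fit_labels(42)
labels_b = fit_labels(42)

# True when both runs assign every sample to the same cluster.
same = np.array_equal(labels_a, labels_b)
print(same)
```

If random_state is left at its default of None, each fit draws a fresh initialization, so the cluster numbering and sometimes the partition itself can change between runs.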
