I am using the document clustering code available here. I know that k-means solves a non-convex problem, so the result of the optimization can differ every time I run it, but is there a way to make the clustering reproducible (maybe by fixing some random seed)?
You can fix the random_state parameter of KMeans. In the following code I use 42:
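from sklearn.cluster import KMeans

# true_k and opts are assumed to be defined earlier in the clustering script
km = KMeans(n_clusters=true_k, init='k-means++', max_iter=100, n_init=1,
            verbose=opts.verbose,
            random_state=42)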
You can check the documentation here.
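If you want to convince yourself that the seed is what makes the run repeatable, here is a minimal sketch. It uses synthetic data from make_blobs as a stand-in for the document vectors, and the helper name cluster is hypothetical, so treat it as an illustration rather than the code from the linked example. Fitting twice with the same random_state should give the same label assignment.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic data standing in for the document vectors (an assumption for illustration).
X, _ = make_blobs(n_samples=300, centers=4, random_state=0)


def cluster(seed):
    """Fit k-means with a fixed seed and return the cluster label of each sample."""
    km = KMeans(n_clusters=4, init='k-means++', max_iter=100, n_init=1,
                random_state=seed)
    return km.fit_predict(X)


# Two independent fits with the same seed start from the same initial centroids.
labels_a = cluster(42)
labels_b = cluster(42)

same_seed_identical = bool(np.array_equal(labels_a, labels_b))
print(same_seed_identical)
```

Note that this only pins down the randomness inside KMeans. If the input data itself is produced by a randomized step (shuffling, sampling, a randomized dimensionality reduction), that step needs a fixed seed as well.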