
What is meant by the term ‘random-state’ in 'KMeans' function in package 'sklearn.cluster' in python

Original post 2017-09-08 04:39:45 0 4 python/ cluster-analysis

What is meant by "random-state" in the Python KMeans function? I searched Google and referred to https://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html but I could not find a precise answer.

4 answers

A gotcha with the k-means algorithm is that it is not guaranteed to be optimal. That is, it may not find the best solution, because the optimization problem is not convex.

You may get stuck in a local minimum, so the result of the algorithm depends on how the centroids are initialized. A good practice for finding a good minimum is to rerun the algorithm several times with different initializations and keep the best result.
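The restart strategy described above can be sketched as follows. This is a minimal illustration, not the answerer's code: the toy data, the number of clusters, and the range of seeds are all assumptions chosen for the example. Each fit uses a single initialization (n_init=1), and the run with the lowest inertia_ (within-cluster sum of squares) is kept.

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy data (an assumption for illustration): three Gaussian blobs in 2-D.
rng = np.random.RandomState(0)
blob_centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
X = np.vstack([c + rng.randn(50, 2) for c in blob_centers])

# One k-means run per seed; n_init=1 means a single initialization per fit.
inertias = {}
for seed in range(10):
    km = KMeans(n_clusters=3, init="random", n_init=1, random_state=seed).fit(X)
    inertias[seed] = km.inertia_  # final within-cluster sum of squares

# Keep the seed whose run ended in the lowest inertia.
best_seed = min(inertias, key=inertias.get)
print("best seed:", best_seed, "inertia:", inertias[best_seed])

# KMeans can also do the restarts itself: n_init=10 runs ten
# initializations and keeps the best one.
km_multi = KMeans(n_clusters=3, init="random", n_init=10, random_state=0).fit(X)
print("best of 10 inits:", km_multi.inertia_)
```

In practice the n_init parameter is usually enough; the explicit loop is only there to make the dependence on the seed visible.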

As stated by the others, random_state makes the results reproducible and can be useful for debugging.

Bear in mind that the KMeans function is stochastic: the results may vary from run to run even with the same input values. To make the results reproducible, you can specify a value for the random_state parameter.
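A small check of this, assuming a toy dataset of 200 random points and an arbitrary seed of 0 (both are illustrative choices): two estimators built with the same random_state and fitted on the same data should end up with the same labels and centroids.

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy data, assumed for illustration only.
X = np.random.RandomState(42).rand(200, 2)

# Two independent estimators with the same seed.
a = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)
b = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)

same_labels = np.array_equal(a.labels_, b.labels_)
same_centers = np.allclose(a.cluster_centers_, b.cluster_centers_)
print(same_labels, same_centers)
```

With random_state left at None, the two fits draw different random numbers, so their labels and centroids may or may not coincide.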

The random_state parameter of sklearn's KMeans mainly helps to:

Start with the same random data point as the first centroid if you use k-means++ to initialize the centroids (the remaining centroids are also drawn at random, from the same seed).
Start with the same K random data points as centroids if you use random initialization.
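Both cases can be sketched as below. The k-means++ part uses sklearn.cluster.kmeans_plusplus, which is assumed to be available (it was added in scikit-learn 0.24). The random_init helper is hypothetical: it is a stand-in that mimics "pick K random data points", not scikit-learn's internal code.

```python
import numpy as np
from sklearn.cluster import kmeans_plusplus

# Toy data, assumed for illustration only.
X = np.random.RandomState(1).rand(100, 2)

# k-means++ seeding: the same random_state gives the same initial centroids.
centers1, idx1 = kmeans_plusplus(X, n_clusters=3, random_state=0)
centers2, idx2 = kmeans_plusplus(X, n_clusters=3, random_state=0)
pp_same = np.array_equal(idx1, idx2)


def random_init(X, k, seed):
    """Hypothetical helper: pick k distinct data points as starting centroids."""
    rng = np.random.RandomState(seed)
    idx = rng.choice(len(X), size=k, replace=False)
    return X[idx]


# Random initialization: the same seed picks the same K data points.
r1 = random_init(X, 3, seed=0)
r2 = random_init(X, 3, seed=0)
random_same = np.array_equal(r1, r2)
print(pp_same, random_same)
```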

This helps when one wants to reproduce results at some later point in time.

random_state : int, RandomState instance or None, optional, default: None
If int, random_state is the seed used by the random number generator;
If RandomState instance, random_state is the random number generator;
If None, the random number generator is the RandomState instance used by np.random.

See: http://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html
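The three accepted forms can be explored with sklearn.utils.check_random_state, the scikit-learn utility that turns such a value into a RandomState object. The sketch below only illustrates the rule quoted above; the variable names and the seed 0 are arbitrary.

```python
import numpy as np
from sklearn.utils import check_random_state

# int: a new RandomState seeded with that integer.
rs_int = check_random_state(0)
int_matches = rs_int.randint(1000) == np.random.RandomState(0).randint(1000)

# RandomState instance: the very same object is used as the generator.
shared = np.random.RandomState(0)
rs_inst = check_random_state(shared)

# None: the global RandomState instance used by np.random.
rs_none = check_random_state(None)

print(int_matches, rs_inst is shared, isinstance(rs_none, np.random.RandomState))
```

One consequence worth noting: an int gives every fit a fresh generator starting from the same seed, while a shared RandomState instance keeps advancing, so successive fits that receive the same instance draw different numbers.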

Getting a memory error when using sklearn.cluster Kmeans

Problems with the random-state parameter on data splitting with sklearn

how to solve error module sklearn.cluster?

Cluster datapoints using kmeans sklearn in python

Value at KMeans.cluster_centers_ in sklearn KMeans

python/sklearn - how to get clusters and cluster names after doing kmeans

ImportError: No module named sklearn.cluster in dbscan example

How to assign sample_weights in sklearn.cluster DBSCAN?

Why are KMeans cluster labels not always the same with set random_state?

Cannot use Kmeans Cluster inside a python function


The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address. For any questions, please contact: yoyou2525@163.com.


solution1
6 ACCEPTED 2017-09-08 06:21:21

solution2
4 2017-09-08 04:51:04

solution3
3 2019-11-19 14:27:10

solution4
2 2017-09-08 04:48:33
