I am trying to run a clustering analysis with three different clustering algorithms. I load the data from stdin as follows:
import sys
import numpy
import sklearn.cluster as cluster
X = []
for line in sys.stdin:
    x1, x2 = line.strip().split()
    X.append([float(x1), float(x2)])
X = numpy.array(X)
I then store the algorithm names and their parameters in a list:
clustering_configs = [
    ### K-Means
    ['KMeans', {'n_clusters' : 5}],
    ### Ward
    ['AgglomerativeClustering', {
        'n_clusters' : 5,
        'linkage' : 'ward'
    }],
    ### DBSCAN
    ['DBSCAN', {'eps' : 0.15}]
]
Finally, I try to call them in a for loop:
for alg_name, alg_params in clustering_configs:
    class_ = getattr(cluster, alg_name)
    instance_ = class_(alg_params)
    instance_.fit_predict(X)
Everything works except the call to instance_.fit_predict(X), which raises this error:
Traceback (most recent call last):
  File "meta_cluster.py", line 47, in <module>
    instance_.fit_predict(X)
  File "/usr/local/lib/python2.7/dist-packages/scikit_learn-0.17.1-py2.7-linux-x86_64.egg/sklearn/cluster/k_means_.py", line 830, in fit_predict
    return self.fit(X).labels_
  File "/usr/local/lib/python2.7/dist-packages/scikit_learn-0.17.1-py2.7-linux-x86_64.egg/sklearn/cluster/k_means_.py", line 812, in fit
    X = self._check_fit_data(X)
  File "/usr/local/lib/python2.7/dist-packages/scikit_learn-0.17.1-py2.7-linux-x86_64.egg/sklearn/cluster/k_means_.py", line 789, in _check_fit_data
    X.shape[0], self.n_clusters))
TypeError: %d format: a number is required, not dict
Does anyone know where I am going wrong? According to the sklearn docs, fit_predict just needs an array-like or sparse matrix of shape (n_samples, n_features), which I believe I have.
Any suggestions? Thanks!
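The documented signature of KMeans is
class sklearn.cluster.KMeans(n_clusters=8, init='k-means++', n_init=10, max_iter=300, tol=0.0001, precompute_distances='auto', verbose=0, random_state=None, copy_x=True, n_jobs=1, algorithm='auto')
The way you would call the KMeans class is
KMeans(n_clusters=5)
With your current code you are calling
KMeans({'n_clusters': 5})
which passes the whole alg_params dict as the first positional argument, n_clusters, instead of passing its entries as keyword arguments. That is why the error says a number is required, not a dict: self.n_clusters is the dict itself. Unpacking the dict at the call site, class_(**alg_params), passes each key as a keyword argument. The same goes for the other algorithms.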
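To make the difference concrete, here is a minimal sketch that does not need scikit-learn installed. StandInKMeans is a hypothetical class invented for this illustration; it only mimics the first part of a constructor signature like the one KMeans has, and it does no clustering.

```python
# Hypothetical stand-in for an estimator such as KMeans. Only the
# constructor signature matters here; this is not scikit-learn code.
class StandInKMeans:
    def __init__(self, n_clusters=8, max_iter=300):
        self.n_clusters = n_clusters
        self.max_iter = max_iter


alg_params = {'n_clusters': 5}

# Positional call: the whole dict is bound to the first parameter.
wrong = StandInKMeans(alg_params)

# Keyword unpacking: each dict key becomes a keyword argument.
right = StandInKMeans(**alg_params)

print(type(wrong.n_clusters).__name__)  # dict
print(right.n_clusters)                 # 5
```

In the loop from the question, the corresponding change would presumably be instance_ = class_(**alg_params), leaving the rest of the loop as it is.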