
Error: Number of labels is 1. Valid values are 2 to n_samples - 1 (inclusive)

Beforehand I replaced missing values, transformed variables, and removed redundant values. The code I ran:

from sklearn.metrics import silhouette_samples, silhouette_score
from sklearn.cluster import KMeans

range_n_clusters = [1, 2, 3, 4, 5]

for n_clusters in range_n_clusters:
    clusterer = KMeans(n_clusters=n_clusters, random_state=10)
    cluster_labels = clusterer.fit_predict(df)

    silhouette_avg = silhouette_score(df, cluster_labels)
    print('For n_clusters=', n_clusters,
          'The aversge silhouette_score is :', silhouette_avg)

    sample_silhouette_values = silhouette_samples(df, cluster_labels)

Error:

ValueError                                Traceback (most recent call last)
<ipython-input-40-1bd61ca1e514> in <module>
  7     cluster_labels=clusterer.fit_predict(df)
  8 
----> 9     silhouette_avg=silhouette_score(df, cluster_labels)
 10     print('For n_clusters=', n_clusters,
 11          'The aversge silhouette_score is :', silhouette_avg)

~\anaconda3\lib\site-packages\sklearn\utils\validation.py in inner_f(*args, **kwargs)
 71                           FutureWarning)
 72         kwargs.update({k: arg for k, arg in zip(sig.parameters, args)})
 ---> 73         return f(**kwargs)
 74     return inner_f
 75 

 ~\anaconda3\lib\site-packages\sklearn\metrics\cluster\_unsupervised.py in silhouette_score(X, labels, metric, sample_size, random_state, **kwds)
115         else:
116             X, labels = X[indices], labels[indices]
--> 117     return np.mean(silhouette_samples(X, labels, metric=metric, **kwds))
118 
119 

~\anaconda3\lib\site-packages\sklearn\utils\validation.py in inner_f(*args, **kwargs)
 71                           FutureWarning)
 72         kwargs.update({k: arg for k, arg in zip(sig.parameters, args)})
 ---> 73         return f(**kwargs)
 74     return inner_f
 75 

~\anaconda3\lib\site-packages\sklearn\metrics\cluster\_unsupervised.py in silhouette_samples(X, labels, metric, **kwds)
227     n_samples = len(labels)
228     label_freqs = np.bincount(labels)
--> 229     check_number_of_labels(len(le.classes_), n_samples)
230 
231     kwds['metric'] = metric

~\anaconda3\lib\site-packages\sklearn\metrics\cluster\_unsupervised.py in check_number_of_labels(n_labels, n_samples)
 32     """
 33     if not 1 < n_labels < n_samples:
 ---> 34         raise ValueError("Number of labels is %d. Valid values are 2 "
 35                          "to n_samples - 1 (inclusive)" % n_labels)
 36 

ValueError: Number of labels is 1. Valid values are 2 to n_samples - 1 (inclusive)
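The last frame of the traceback fails the check `1 < n_labels < n_samples`. A minimal reproduction may make that easier to see. It is only a sketch: the random array `X` is an assumption standing in for `df`, and `n_init=10` is set explicitly just to keep behavior stable across scikit-learn versions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Synthetic data (an assumption; it stands in for the asker's df)
rng = np.random.RandomState(0)
X = rng.rand(20, 2)

# Fitting a single cluster succeeds, but every sample gets the same label
labels = KMeans(n_clusters=1, n_init=10, random_state=10).fit_predict(X)
n_labels = len(np.unique(labels))
print("distinct labels:", n_labels)

# The silhouette score rejects that labelling
try:
    silhouette_score(X, labels)
    raised = False
except ValueError as exc:
    raised = True
    print("ValueError:", exc)
```

So the clustering step itself completes; it is the metric call on the next line that raises.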

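KMeans itself will run with n_clusters=1, but the silhouette score is only defined for at least 2 clusters. With k=1 every sample gets the same label, so the result is just the dataset itself with no other cluster to compare against. If you start the range at 2 and use the code below (mind the indentation), it should work: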

from sklearn import datasets
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_samples, silhouette_score

# Example data: the iris feature matrix stands in for the asker's df
iris = datasets.load_iris()
df = iris.data

# Start at 2: the silhouette score is undefined for a single cluster
range_n_clusters = [2, 3, 4, 5]

for n_clusters in range_n_clusters:
    clusterer = KMeans(n_clusters=n_clusters, random_state=10)
    cluster_labels = clusterer.fit_predict(df)

    # Mean silhouette coefficient over all samples
    silhouette_avg = silhouette_score(df, cluster_labels)
    print('For n_clusters=', n_clusters, 'The average silhouette_score is :', silhouette_avg)

    # Per-sample silhouette coefficients
    sample_silhouette_values = silhouette_samples(df, cluster_labels)

Output:

For n_clusters= 2 The average silhouette_score is : 0.681046169211746
For n_clusters= 3 The average silhouette_score is : 0.5528190123564091
For n_clusters= 4 The average silhouette_score is : 0.4980505049972867
For n_clusters= 5 The average silhouette_score is : 0.4887488870931048
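If the candidate list might still contain 1, or any k that yields an invalid number of labels, one option is to check the label count before calling the metric. This is only a sketch: the helper name `silhouette_by_k` is hypothetical, iris again stands in for the real data, and `n_init=10` is set explicitly to keep results stable across scikit-learn versions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.metrics import silhouette_score


def silhouette_by_k(X, candidate_ks, random_state=10):
    """Return {k: average silhouette score}, skipping k the metric would reject.

    Hypothetical helper for illustration, not part of scikit-learn.
    """
    n_samples = len(X)
    scores = {}
    for k in candidate_ks:
        model = KMeans(n_clusters=k, n_init=10, random_state=random_state)
        labels = model.fit_predict(X)
        n_labels = len(np.unique(labels))
        # silhouette_score requires 2 <= n_labels <= n_samples - 1
        if not 1 < n_labels < n_samples:
            continue
        scores[k] = silhouette_score(X, labels)
    return scores


X = load_iris().data
scores = silhouette_by_k(X, [1, 2, 3, 4, 5])

# Pick the k with the highest average silhouette score
best_k = max(scores, key=scores.get)
print(scores)
print("best k:", best_k)
```

Skipping k=1 silently is a design choice; logging a warning or raising a clearer error may be preferable in a pipeline.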

