Using a subset of a pandas DataFrame with SciPy kmeans?
I have a data frame that I import using
df = pd.read_csv('my.csv',sep=',')
In that CSV file, the first row holds the column names, and the first column holds the observation names.
I know how to select a subset of the pandas data frame using:
df.iloc[:,1::]
which gives me only the numeric values. But when I try to use this with scipy.cluster.vq.kmeans, using this command,
kmeans(df.iloc[:,1::],3)
I get the error
'DataFrame' object has no attribute 'dtype'
Any suggestions?
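Here is an example using KMeans from scikit-learn. The key step is to pass the underlying NumPy array (via .values) rather than the DataFrame itself.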
from sklearn.datasets import make_blobs
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
# try to simulate your data
# =====================================================
X, y = make_blobs(n_samples=1000, n_features=10, centers=3)
columns = ['feature' + str(x) for x in np.arange(1, 11, 1)]
d = {key: values for key, values in zip(columns, X.T)}
d['label'] = y
data = pd.DataFrame(d)
Out[72]:
feature1 feature10 feature2 ... feature8 feature9 label
0 1.2324 -2.6588 -7.2679 ... 5.4166 8.9043 2
1 0.3569 -1.6880 -5.7671 ... -2.2465 -1.7048 0
2 1.0177 -1.7145 -5.8591 ... -0.5755 -0.6969 0
3 1.5735 -0.0597 -4.9009 ... 0.3235 -0.2400 0
4 -0.1042 -1.6703 -4.0541 ... 0.4456 -1.0406 0
.. ... ... ... ... ... ... ...
995 -0.0983 -1.4569 -3.5179 ... -0.3164 -0.6685 0
996 1.3151 -3.3253 -7.0984 ... 3.7563 8.4052 2
997 -0.9177 0.7446 -4.8527 ... -2.3793 -0.4038 0
998 2.0385 -3.9001 -7.7472 ... 5.2290 9.2281 2
999 3.9357 -7.2564 5.7881 ... 1.2288 -2.2305 1
[1000 rows x 11 columns]
# fit your data with KMeans
# =====================================================
kmeans = KMeans(n_clusters=3)
# select every column except the last one ('label') and pass the raw NumPy array
kmeans.fit_predict(data.iloc[:, :-1].values)
Out[70]: array([1, 0, 0, ..., 0, 1, 2], dtype=int32)
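The same idea should carry over to the scipy.cluster.vq.kmeans call from the question: convert the numeric columns to a NumPy array before passing them in. The error message suggests that SciPy looked for a .dtype attribute, which a DataFrame does not have (it has .dtypes instead). The following is a minimal sketch, not a definitive fix; the inline CSV text and its column names (name, a, b) are made up to stand in for 'my.csv'.

```python
import io

import numpy as np
import pandas as pd
from scipy.cluster.vq import kmeans, vq

# Hypothetical stand-in for 'my.csv': a header row plus a leading
# observation-name column, as described in the question.
csv_text = """name,a,b
obs1,1.0,1.1
obs2,0.9,1.0
obs3,5.0,5.2
obs4,5.1,4.9
obs5,9.0,9.1
obs6,9.2,8.8
"""
df = pd.read_csv(io.StringIO(csv_text), sep=',')

# Drop the first (name) column and convert the rest to a float ndarray,
# which is the input type SciPy's kmeans is documented to accept.
features = df.iloc[:, 1:].to_numpy(dtype=float)

# kmeans returns the centroids (codebook) and the mean distortion.
centroids, distortion = kmeans(features, 3)

# vq assigns each observation to its nearest centroid.
labels, _ = vq(features, centroids)
```

Here labels holds one cluster index per row of df, so it can be attached back to the frame with df['cluster'] = labels if needed. Note that SciPy may return fewer than the requested number of centroids when a cluster ends up empty.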