Clustering with scikit-learn

Question

I'm working with scikit-learn for the first time and am trying to do k-means clustering. I think I'm doing it all correctly.

I have a dataframe df with a datetime index and 2 columns of ints.

from sklearn.cluster import KMeans
kmeans = KMeans(n_clusters=2)
kmeans.fit(df.values)  # df.values is the 2-D array holding the two int columns

Then I have another set of data that looks the same, and I want to predict on it. So I pass df1 into kmeans.predict().

Do I need to add some column to each of those dataframes for the classification? I'm assuming everything I put into the fit is good.

After getting a classification completed, how do I then visualize it in a graph?

Thanks

Answer 1

Without seeing the data, and assuming you want the resulting predictions as a column in the second dataframe (df2), you can call kmeans.predict() on each row with the .apply() function, specifying axis=1 so that the function receives one row at a time. Note that predict() expects a 2-D array, so each row has to be wrapped in a list. This will give you an additional column with the predicted cluster labels.

i.e.

df2['predictions'] = df2.apply(lambda row: kmeans.predict([row.values])[0], axis=1)
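
As an alternative to a row-wise apply, predict() also accepts the whole 2-D array in one call, which is usually faster. The following is only a minimal sketch: it assumes df and df2 each have a datetime index and two integer columns, and the column names a and b, the helper make_frame, and the sample data are all made up for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

def make_frame(start):
    """Hypothetical data: two well-separated groups of integer points."""
    low = rng.integers(0, 10, size=(20, 2))
    high = rng.integers(90, 100, size=(20, 2))
    idx = pd.date_range(start, periods=40, freq="D")
    return pd.DataFrame(np.vstack([low, high]), index=idx, columns=["a", "b"])

df = make_frame("2015-01-01")   # data used for fitting
df2 = make_frame("2015-03-01")  # new data to assign to clusters

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
kmeans.fit(df.values)

# Cluster labels for the fitted data are stored on the model itself
df["cluster"] = kmeans.labels_

# Predict all rows of the new data at once; the feature columns are
# selected explicitly so that added columns are never fed to the model
df2["predictions"] = kmeans.predict(df2[["a", "b"]].values)
```

No extra column is needed before fitting. The labels come back as an array with one entry per row, and storing them in a new column is just a convenience.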

Here's the documentation for apply: http://pandas.pydata.org/pandas-docs/version/0.17.1/generated/pandas.DataFrame.apply.html
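
On the visualization part of the question, one common option is a scatter plot of the two columns, colored by cluster label. This is a rough sketch rather than a definitive recipe: the data, the column names a and b, and the output file name clusters.png are assumptions, and for brevity the model is fitted on the same hypothetical frame that it labels.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line to get an interactive window
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical data: two groups of integer points with a datetime index
rng = np.random.default_rng(1)
points = np.vstack([
    rng.integers(0, 10, size=(20, 2)),
    rng.integers(90, 100, size=(20, 2)),
])
index = pd.date_range("2015-03-01", periods=40, freq="D")
df2 = pd.DataFrame(points, index=index, columns=["a", "b"])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
kmeans.fit(df2[["a", "b"]].values)
df2["predictions"] = kmeans.predict(df2[["a", "b"]].values)

fig, ax = plt.subplots()
# One point per row, colored by its predicted cluster label
ax.scatter(df2["a"], df2["b"], c=df2["predictions"], cmap="viridis")
# Mark the fitted cluster centers
centers = kmeans.cluster_centers_
ax.scatter(centers[:, 0], centers[:, 1], marker="x", color="red", s=100,
           label="cluster centers")
ax.set_xlabel("a")
ax.set_ylabel("b")
ax.legend()
fig.savefig("clusters.png")
```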

Hope that helps. Let me know if you need anything else.

Answer 1
0 2015-12-11 10:32:59

使用scikit-learn进行聚类

问题描述

1 个解决方案

解决方案1 0 2015-12-11 10:32:59

解决方案1
0 2015-12-11 10:32:59