scikit-learn preparation

I am trying to use the scikit-learn package for semi-supervised classification. I have a file with classes, instances and features, but I am not sure how to prepare this file for scikit-learn. Could you give some guidelines for file preparation? The tutorial only provides instructions for loading prepared data sets from machine learning repositories. Thank you!

Scikit-learn directly supports special learning-oriented input formats, notably SVMLight. But in general, its input is a NumPy array (when dense), which can be produced from a diverse range of data sources using other tools from the SciPy stack, notably scipy.io and, more pertinently in the case of a text file with columns, the pandas IO tools. You can likely use pandas.read_csv, then pull the target class column out of the feature set so that it is no longer among the features.
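As a rough sketch of the pandas.read_csv route, assuming a comma-separated file with a header row: the column names (`instance`, `f1`, `f2`, `label`) and the inline sample data below are made up for illustration, and the real file layout may differ. Instances with no known class are left blank in the file and end up coded as -1, which is the marker the estimators in sklearn.semi_supervised use for unlabeled samples.

```python
import io

import pandas as pd

# Hypothetical file contents for illustration; in practice pass the
# path of your own file to pd.read_csv instead of a StringIO object.
csv_text = """instance,f1,f2,label
a,0.1,1.2,spam
b,0.3,0.8,ham
c,0.2,1.1,
d,0.9,0.4,ham
"""

df = pd.read_csv(io.StringIO(csv_text))

# Pull the class column out of the frame; pop() removes it from df
# so it cannot leak into the feature matrix.
y_raw = df.pop("label")

# Drop the identifier column (assumed not to be a feature) and
# convert the remaining columns to a dense float array.
X = df.drop(columns=["instance"]).to_numpy(dtype=float)

# Encode class names as integers in order of first appearance.
# Missing labels (the blank field above) are coded as -1.
y, classes = pd.factorize(y_raw)

print(X.shape)
print(y)
print(list(classes))
```

`X` and `y` can then be passed to an estimator's `fit(X, y)`; `classes` maps the integer codes back to the original class names.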
