
How to convert a numpy array into libsvm format

I have a numpy array for an image and am trying to dump it into the libsvm format of LABEL I0:V0 I1:V1 I2:V2 ... IN:VN. I see that scikit-learn has a dump_svmlight_file function and would like to use it if possible, since it's optimized and stable.

It takes the parameters X, y, and the output file name. The values I'm thinking about would be:

X - numpy array
y - ????
file output name - self-explanatory

Would this be a correct assumption for X? I'm very confused about what I should pass for y, though. It appears it needs to be a feature set of some kind, but I don't know how I would go about obtaining that. Thanks in advance for the help!

The svmlight format is tailored to classification/regression problems. Therefore, the array X is a matrix with as many rows as data points in your set, and as many columns as features. y is the vector of instance labels, with one entry per row of X.
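To make the mapping from X and y to the file concrete, here is a minimal sketch of how one row becomes one line. The helper to_svmlight_line is a hypothetical name made up for illustration; it is not how scikit-learn implements the writer. It assumes zero-based feature indices and skips zero-valued features, as sparse svmlight files usually do.

```python
import numpy as np

def to_svmlight_line(label, features, zero_based=True):
    """Format one sample as 'LABEL I:V I:V ...' (illustrative helper, not a library API)."""
    offset = 0 if zero_based else 1
    # Only non-zero features are written; the format is sparse.
    pairs = [f"{i + offset}:{float(v):g}" for i, v in enumerate(features) if v != 0]
    return " ".join([f"{float(label):g}"] + pairs)

# Two samples with three features each; y holds one label per row of X.
X = np.array([[0.5, 0.0, 2.0],
              [0.0, 1.5, 0.0]])
y = np.array([1, -1])

lines = [to_svmlight_line(label, row) for label, row in zip(y, X)]
print("\n".join(lines))
```

Each row of X yields exactly one line, and the matching entry of y is the label at the start of that line.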

For example, suppose you have 1000 objects (images of bicycles and bananas, say), featurized in 400 dimensions. X would be 1000x400, and y would be a 1000-vector with a 1 entry for each bicycle and a -1 entry for each banana.
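A sketch of that 1000x400 example with dump_svmlight_file might look like the following. The feature values are random placeholders and the half-and-half label split is an assumption made only so the snippet runs; an in-memory buffer stands in for a real output file.

```python
import io

import numpy as np
from sklearn.datasets import dump_svmlight_file, load_svmlight_file

# Placeholder data: 1000 samples with 400 features each (random, for illustration only).
rng = np.random.default_rng(0)
X = rng.random((1000, 400))

# Assumed labeling: first 500 rows are bicycles (1), the rest are bananas (-1).
y = np.where(np.arange(1000) < 500, 1, -1)

# Write to an in-memory binary buffer; a file path string works as well.
buf = io.BytesIO()
dump_svmlight_file(X, y, buf, zero_based=True)

# Read the data back to check the round trip (X_back is a sparse matrix).
buf.seek(0)
X_back, y_back = load_svmlight_file(buf, n_features=400, zero_based=True)
print(X_back.shape, y_back[:3])
```

To write to disk instead, pass a path in place of the buffer. Note that libsvm tools conventionally use one-based feature indices, so zero_based=False may be the safer choice if the file is meant for them.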

