
Scala Multiclass classification with labeled point

I have a multiclass classification problem that I want to solve with logistic regression. I know this can also be tackled with decision trees and random forests, but I want to stick specifically with "LogisticRegressionWithLBFGS". All the data tidying is done: my data sits in a DataFrame with a label field (String), a feature vector (a vector of numeric features) and a third column, "LabelIndex" (numbers representing the class).

When I do a train/test split on the DataFrame and try to fit it with LogisticRegressionWithLBFGS:

val model = new LogisticRegressionWithLBFGS().setNumClasses(10).setIntercept(true).setValidateData(true).run("trainingData")

It doesn't like the "run" part.

The example I am working from loads a data file via:

val data = MLUtils.loadLibSVMFile(Spark.sparkContext, "data/mnist.bz2")

I'm trying to copy the example and slot in my own data, but my data is in a different format and looks different. From a bit of reading, I gather that I need to convert my DataFrame to an RDD[LabeledPoint], which means I need to map it.

I'm having trouble finding good information on how to do this.

How do I convert a DataFrame with the 3 fields described above, "Label" (String), "Features" (feature vector) and "IndexedLabel" (Double), into an RDD[LabeledPoint]?

Got it working:

Can't convert Dataframe to Labeled Point

This link showed me how to make the conversion successfully.
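For readers without the link at hand, here is a minimal sketch of what such a conversion might look like. It is not the linked answer's code. It assumes the column names "Features" and "IndexedLabel" from the question, assumes the feature column holds org.apache.spark.ml.linalg vectors (as produced by the spark.ml feature transformers in Spark 2.x and later), and uses a hypothetical helper object named LabeledPointSketch. The key point is that run expects an RDD[LabeledPoint], not a string or a DataFrame.

```scala
import org.apache.spark.ml.linalg.{Vector => MLVector}
import org.apache.spark.mllib.classification.{LogisticRegressionModel, LogisticRegressionWithLBFGS}
import org.apache.spark.mllib.linalg.{Vectors => MLlibVectors}
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.DataFrame

// Hypothetical helper object; the name is illustrative only.
object LabeledPointSketch {

  // Map each row to a LabeledPoint built from the numeric label and the feature vector.
  // Assumption: "IndexedLabel" is a Double and "Features" is a spark.ml vector.
  def toLabeledPoints(df: DataFrame): RDD[LabeledPoint] =
    df.select("IndexedLabel", "Features").rdd.map { row =>
      val label = row.getAs[Double]("IndexedLabel")
      // The RDD-based API needs mllib vectors, so convert from the spark.ml type.
      val features = MLlibVectors.fromML(row.getAs[MLVector]("Features"))
      LabeledPoint(label, features)
    }

  // Split the DataFrame, convert the training part, and fit the model.
  // The 80/20 split and the seed are arbitrary choices for this sketch.
  def train(df: DataFrame): LogisticRegressionModel = {
    val Array(trainingDf, _) = df.randomSplit(Array(0.8, 0.2), seed = 42L)
    val trainingData = toLabeledPoints(trainingDf).cache()

    new LogisticRegressionWithLBFGS()
      .setNumClasses(10)
      .setIntercept(true)
      .setValidateData(true)
      .run(trainingData) // an RDD[LabeledPoint], not the string "trainingData"
  }
}
```

If the feature column already contains org.apache.spark.mllib.linalg vectors, the fromML call is unnecessary and the vector can be read from the row directly. With setValidateData(true), the labels are expected to be class indices from 0 to 9 for 10 classes.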
