What import is missing to make Spark's MLlib linear regression in Scala example work?

I am using Spark v1.0-rc3. When implementing MLlib's linear regression I get an error. Eventually I tried copy/pasting Spark's MLlib example code for linear regression in Scala, and I still receive the error:

scala> val parsedData = data.map { line =>
  val parts = line.split(',')
  LabeledPoint(parts(0).toDouble, parts(1).split(' ').map(x => x.toDouble).toArray)
}
<console>:28: error: polymorphic expression cannot be instantiated to expected type;
 found   : [U >: Double]Array[U]
 required: org.apache.spark.mllib.linalg.Vector
         LabeledPoint(parts(0).toDouble, parts(1).split(' ').map(x => x.toDouble).toArray)

The error states that org.apache.spark.mllib.linalg.Vector is required, but importing it does not help. Even when I try multiple ways of casting to a Vector, I get:

<console>:19: error: type mismatch;
 found   : scala.collection.immutable.Vector[Array[Double]]

The problem is due to API changes in the newer version: code that worked in v0.9.1 needs tweaking for v1.0. The latest docs show the updated API. The solution is to import Vectors, not Vector, despite what the error tells you, and to wrap the feature array with Vectors.dense. Try:

import org.apache.spark.mllib.regression.LinearRegressionWithSGD
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.mllib.linalg.Vectors

// Load and parse the data
val data = sc.textFile("mllib/data/ridge-data/lpsa.data")
val parsedData = data.map { line =>
  val parts = line.split(',')
  LabeledPoint(parts(0).toDouble, Vectors.dense(parts(1).split(' ').map(x => x.toDouble)))
}
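To see the fix in isolation, here is a minimal sketch of the parsing step applied to a single line, with no SparkContext involved. It assumes spark-mllib is on the classpath (as it is in spark-shell). The helper name `parseLine`, the import alias `MLVector`, and the sample input line are made up for illustration. The alias is optional; it only makes explicit that the MLlib trait is meant. A likely reason the casting attempts in the question produced `scala.collection.immutable.Vector[Array[Double]]` is that a bare `Vector(...)` call resolves to the Scala standard library collection rather than to anything in MLlib.

```scala
import org.apache.spark.mllib.linalg.{Vector => MLVector, Vectors}
import org.apache.spark.mllib.regression.LabeledPoint

// Hypothetical helper: parses one line of the form "label,f1 f2 f3".
def parseLine(line: String): LabeledPoint = {
  val parts = line.split(',')
  // Vectors.dense is the factory that turns an Array[Double] into an MLlib Vector.
  val features: MLVector = Vectors.dense(parts(1).split(' ').map(_.toDouble))
  LabeledPoint(parts(0).toDouble, features)
}

// Made-up sample line in the same "label,features" layout as the data file.
val point = parseLine("-0.43,-1.63 -2.00 -1.86")
println(point.label)
println(point.features.toArray.mkString(" "))
```

With a helper like this, the RDD version reduces to `data.map(parseLine)`.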
