
Spark Scala Vector Map ClassCastException

I am trying to make use of the Statistics tool in Spark for Scala, and I am having difficulty preparing a vector that it will accept.

val featuresrdd = features.rdd.map{_.getAs[Vector]("features")}

featuresrdd: org.apache.spark.rdd.RDD[org.apache.spark.mllib.linalg.Vector] = MapPartitionsRDD[952] at map at <console>:82

This produces an RDD declared with element type 'mllib.linalg.Vector'; however, when I use it in the tool, the vectors turn out to actually be of type 'ml.linalg.DenseVector'.
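For reference, one quick way to confirm what the column actually stores (a minimal sketch, assuming features is the DataFrame referenced above with a "features" column) is to print the runtime class of a single value; getAs does not enforce the type here because its type parameter is erased:

// Minimal diagnostic sketch: print the runtime class of the first stored value.
val storedClass = features.rdd
  .map(row => row.get(row.fieldIndex("features")).getClass.getName)
  .first()
println(storedClass)  // e.g. org.apache.spark.ml.linalg.DenseVector when the column was built with spark.ml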

import org.apache.spark.mllib.linalg._
import org.apache.spark.mllib.stat.Statistics
import org.apache.spark.rdd.RDD

val correlMatrix: Matrix = Statistics.corr(featuresrdd, "pearson")

java.lang.ClassCastException: org.apache.spark.ml.linalg.DenseVector cannot be cast to org.apache.spark.mllib.linalg.Vector

Any help would be greatly appreciated.

Thanks

The column actually holds the new org.apache.spark.ml.linalg.Vector (the spark.ml type), while mllib's Statistics.corr expects the old org.apache.spark.mllib.linalg.Vector, so the conversion has to go from the new type to the old one. Read the column as an ml Vector and convert it with Vectors.fromML (asML is the opposite conversion, from an old mllib Vector to a new ml one):

val oldFeaturesRDD = features.rdd.map(row => Vectors.fromML(row.getAs[org.apache.spark.ml.linalg.Vector]("features")))
val correlMatrix: Matrix = Statistics.corr(oldFeaturesRDD, "pearson")
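If staying with the new spark.ml vectors is preferred (this is where asML is useful, for converting old mllib vectors into new ml ones), Spark 2.2+ also provides a DataFrame-based correlation that works on the ml vector column directly; a minimal sketch, assuming the same features DataFrame and "features" column:

import org.apache.spark.ml.linalg.Matrix
import org.apache.spark.ml.stat.Correlation
import org.apache.spark.sql.Row

// Pearson correlation computed directly on the DataFrame column of ml vectors.
val Row(coeff: Matrix) = Correlation.corr(features, "features", "pearson").head
println(coeff)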
