
Convert DataFrame with Vector column to Dataset - which type to use in the case class

I have a DataFrame with a column of vector type, produced by a one-hot encoder. Let's call the column Vector.

Given a case class Example(vector: WhichType), I want to map the DataFrame to a Dataset:

val ds = dataframe.as[Example]

The question is: which type should the property 'vector' in the case class have?

I get an error message:

need an array field but got struct<type:tinyint,size:int,indices:array<int>,values:array<double>>;

If you're using Spark ML, then you can use the Vector type imported below:

import org.apache.spark.ml.linalg.Vector

case class Example(vector: Vector)
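
As a rough end-to-end sketch of the above, the following builds a small DataFrame with a vector column and converts it to a Dataset. It assumes a local SparkSession and a column named vector to match the case class field. The object name VectorDatasetExample and the sample data are made up for illustration, and the vector is constructed by hand instead of coming from a one-hot encoder. Note that org.apache.spark.mllib.linalg.Vector is a different type from the ml one and will not match a column produced by Spark ML.

```scala
import org.apache.spark.ml.linalg.{Vector, Vectors}
import org.apache.spark.sql.SparkSession

// Defined at top level so Spark can derive an Encoder for it.
case class Example(vector: Vector)

// Hypothetical driver object, for illustration only.
object VectorDatasetExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")          // assumption: running locally
      .appName("vector-to-dataset")
      .getOrCreate()
    import spark.implicits._        // brings the Encoder for Example into scope

    // Stand-in for the one-hot encoder output: one sparse and one dense vector.
    val dataframe = spark.createDataFrame(Seq(
      Tuple1(Vectors.sparse(3, Array(1), Array(1.0))),
      Tuple1(Vectors.dense(0.0, 0.0, 1.0))
    )).toDF("vector")

    // The conversion from the question, now with a matching field type.
    val ds = dataframe.as[Example]
    ds.show(false)

    spark.stop()
  }
}
```

The field is typed as the general Vector trait rather than SparseVector or DenseVector, since a vector column may hold either representation.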
