
Build an RDD[LabeledPoint] from a Spark RDD object in Scala

I want to build an RDD[LabeledPoint] from an RDD object. The RDD looks like this:

 +-------------------+---------+--------------+--------+-------+
 |          date_time|site_name|posa_continent|year    |label  |
 +-------------------+---------+--------------+--------+-------+
 |2014-08-11 07:46:59|        2|             3|2014    |1      |
 |2014-08-11 08:22:12|        2|             3|2014    |2      |
 |2015-08-11 08:24:33|        2|             3|2015    |1      |
 |2016-08-09 18:05:16|        2|             3|2016    |3      |
 |2011-08-09 18:08:18|        2|             3|2011    |2      |
 |2009-08-09 18:13:12|        2|             3|2009    |1      |
 |2014-07-16 09:42:23|        2|             3|2014    |1      |
 +-------------------+---------+--------------+--------+-------+

I want to construct an RDD[LabeledPoint] that uses the label attribute as the label, so that I can apply the KNN machine learning algorithm. I am using the Spark Scala API.

Try the map function on the RDD:

rddsObject.map(row => LabeledPoint(row.label.toDouble,  // `object` is a reserved word in Scala, so the parameter is renamed
  Vectors.dense(row.site_name.toDouble, row.posa_continent.toDouble, row.year.toDouble)))
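
As a fuller illustration of the same map, here is a minimal sketch. It assumes the RDD holds a case class whose fields match the columns in the question; the `Record` type, the `LabeledPointSketch` object, and the local `SparkContext` setup are hypothetical and only there to make the example self-contained. If the data is actually a DataFrame, `.rdd` yields an `RDD[Row]` and the fields would have to be read from each `Row` instead.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.rdd.RDD

// Hypothetical record type mirroring the columns shown in the question.
case class Record(date_time: String, site_name: Int, posa_continent: Int, year: Int, label: Int)

object LabeledPointSketch {
  // LabeledPoint takes a Double label and a Vector of Double features,
  // so every numeric column is converted with toDouble.
  def toLabeledPoint(r: Record): LabeledPoint =
    LabeledPoint(
      r.label.toDouble,
      Vectors.dense(r.site_name.toDouble, r.posa_continent.toDouble, r.year.toDouble))

  def main(args: Array[String]): Unit = {
    // Local master is an assumption for running the sketch on one machine.
    val conf = new SparkConf().setAppName("LabeledPointSketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Two rows copied from the table in the question.
      val rddsObject: RDD[Record] = sc.parallelize(Seq(
        Record("2014-08-11 07:46:59", 2, 3, 2014, 1),
        Record("2014-08-11 08:22:12", 2, 3, 2014, 2)))

      val points: RDD[LabeledPoint] = rddsObject.map(toLabeledPoint)
      points.collect().foreach(println)
    } finally sc.stop()
  }
}
```

The date_time column is left out of the feature vector here because it is not numeric; using it would need an explicit encoding, which the question does not specify.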
