I want to build an RDD[LabeledPoint] from an existing RDD object. The RDD looks like this:
+-------------------+---------+--------------+----+-----+
|          date_time|site_name|posa_continent|year|label|
+-------------------+---------+--------------+----+-----+
|2014-08-11 07:46:59|        2|             3|2014|    1|
|2014-08-11 08:22:12|        2|             3|2014|    2|
|2015-08-11 08:24:33|        2|             3|2015|    1|
|2016-08-09 18:05:16|        2|             3|2016|    3|
|2011-08-09 18:08:18|        2|             3|2011|    2|
|2009-08-09 18:13:12|        2|             3|2009|    1|
|2014-07-16 09:42:23|        2|             3|2014|    1|
+-------------------+---------+--------------+----+-----+
I want to construct an RDD[LabeledPoint] using the label attribute, so that I can apply the KNN machine learning algorithm. I am using the Spark Scala API.
I tried the map function on the RDD:
rddsObject.map(row => LabeledPoint(row.label.toDouble, Vectors.dense(row.site_name.toDouble, row.posa_continent.toDouble, row.year.toDouble)))
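
For reference, here is a minimal self-contained sketch of that kind of mapping. It rests on assumptions not stated in the question: that each element of the RDD is an instance of a case class (the name `Booking` is hypothetical) whose fields match the columns shown above, that the numeric columns are integers, and that the job runs against a local Spark session. Note that `object` is a reserved word in Scala and cannot be used as a lambda parameter name, and that `LabeledPoint` and `Vectors.dense` expect `Double` values.

```scala
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

// Hypothetical record type mirroring the columns shown in the question.
case class Booking(date_time: String, site_name: Int, posa_continent: Int, year: Int, label: Int)

object LabeledPointSketch {
  // Turn each record into a LabeledPoint: the label column becomes the label,
  // and the three numeric columns become a dense feature vector.
  def toLabeledPoints(rows: RDD[Booking]): RDD[LabeledPoint] =
    rows.map { r =>
      LabeledPoint(
        r.label.toDouble,
        Vectors.dense(r.site_name.toDouble, r.posa_continent.toDouble, r.year.toDouble)
      )
    }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only (an assumption, not from the question).
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("labeled-point-sketch")
      .getOrCreate()

    // Two sample rows copied from the table in the question.
    val rows = spark.sparkContext.parallelize(Seq(
      Booking("2014-08-11 07:46:59", 2, 3, 2014, 1),
      Booking("2014-08-11 08:22:12", 2, 3, 2014, 2)
    ))

    toLabeledPoints(rows).collect().foreach(println)
    spark.stop()
  }
}
```

If the data is actually a DataFrame rather than an RDD of case-class instances, `df.rdd` yields an `RDD[Row]`, and the fields would be read with `row.getAs[T]("column_name")` instead of plain field access.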