I am trying to use collectAsMap() in the following statement:
import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
import org.apache.spark.broadcast.Broadcast
import org.apache.spark.rdd.RDD
...
documents_input.
  filter(_ != documents_header).
  map(_.split(",")).
  map(Document.parse(_)).
  keyBy(_.id).
  collectAsMap()
However, I am getting the following error:
value collectAsMap is not a member of org.apache.spark.rdd.RDD[(Int, com.codependent.MyApp.Document)]
Any idea why this happens, or how I could turn the Array into a Map?
Fixed after updating the imports as Ram Ghadiyaram suggested:
import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.broadcast.Broadcast
import org.apache.spark.rdd.RDD
It depends on how you read documents_input. If you read it with the sparkContext (for example sparkContext.textFile), you get an RDD, and collectAsMap should work on the resulting pair RDD: collectAsMap lives in PairRDDFunctions and is added through an implicit conversion, which is why importing org.apache.spark.SparkContext._ fixes it on older Spark versions. But if you read documents_input with Source or any other plain Scala API, you have an ordinary collection, and collectAsMap won't do the trick; in that case, map to key/value pairs and use toMap.
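A minimal sketch of the toMap approach, assuming documents_input was read with scala.io.Source rather than a SparkContext; the Document case class, its fields, and the sample lines below are hypothetical stand-ins for the asker's real types:

```scala
// Stand-in for the asker's Document type and parse logic.
case class Document(id: Int, text: String)

object ToMapExample {
  // Parse CSV-like lines and key the documents by id, mirroring
  // filter/map/keyBy(_.id)/collectAsMap on an RDD, but on a plain
  // Scala collection, where toMap is the right call.
  def documentsById(lines: Seq[String]): Map[Int, Document] =
    lines
      .map(_.split(","))
      .map(a => Document(a(0).trim.toInt, a(1).trim))
      .map(d => d.id -> d) // equivalent of keyBy(_.id)
      .toMap

  def main(args: Array[String]): Unit = {
    // Stand-in for Source.fromFile(...).getLines().toSeq
    val byId = documentsById(Seq("1,first doc", "2,second doc"))
    println(byId(1).text) // prints "first doc"
  }
}
```

The key difference is that collectAsMap is a Spark action on an RDD of pairs, while toMap is a plain Scala collection method on any sequence of tuples; both produce a Map keyed by the first tuple element.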