
How to create an RDD that has data type of string?

I have this line of code:

scala> val quoteRDD = sc.parallelize("\"")
quoteRDD: org.apache.spark.rdd.RDD[Char] = ParallelCollectionRDD[0] at parallelize

How can I make this RDD hold the "\"" as a string data type? Spark says it's a Char, but I need a String.

Can you help me with this change?

Thanks

SparkContext.parallelize has the following signature:

def parallelize[T](seq: Seq[T], numSlices: Int = defaultParallelism)(implicit arg0: ClassTag[T]): RDD[T] 

and a String can be substituted (via implicit conversion) for a Seq[Char], so T is inferred as Char.

If you really want to create a single-element RDD[String] (not much use for that, but let's call it an exercise), add a Seq wrapper:

val quoteRDD = sc.parallelize(Seq("\""))
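The same type inference can be reproduced without Spark. The sketch below is only an illustration: fakeParallelize is a hypothetical stand-in that mirrors the type parameters of parallelize but returns a plain List rather than an RDD, and the object name is made up as well.

```scala
import scala.reflect.ClassTag

object ParallelizeInference {
  // Hypothetical stand-in for SparkContext.parallelize (not a Spark API):
  // same type parameter and ClassTag bound, but it returns a List, not an RDD.
  def fakeParallelize[T: ClassTag](seq: Seq[T]): List[T] = seq.toList

  // A bare String is implicitly wrapped as a sequence of Char, so T = Char.
  val chars: List[Char] = fakeParallelize("\"")

  // Wrapping the String in a Seq makes it one element, so T = String.
  val strings: List[String] = fakeParallelize(Seq("\""))

  def main(args: Array[String]): Unit = {
    println(s"chars has ${chars.length} element(s) of type Char")
    println(s"strings has ${strings.length} element(s) of type String")
  }
}
```

The type annotations on chars and strings are checked at compile time, so the fact that this compiles already shows which element type is inferred in each case.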
