
How to store the result of an action in Apache Spark using Scala

How can I store the result of an action such as count in an output directory, using Apache Spark with Scala?

    val countval = data.map((_, "")).reduceByKey(_ + _).count

The command below does not work, because the result of count is not an RDD:

    countval.saveAsTextFile("OUTPUT LOCATION")

Is there any way to store countval in a local or HDFS location?

After you call count, the result is no longer an RDD.

count returns a plain Long, which has no saveAsTextFile method.

If you want to store countval, you have to write it out the same way you would any other Long, String, or Int.
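As a minimal sketch of the "like any other Long" route, the value can be written to a local file from the driver with plain JVM file I/O. The object name `CountWriter`, the helper `writeCount`, and the path in the usage note are hypothetical names chosen for illustration, and this only covers the local filesystem, not HDFS.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path, Paths}

// Hypothetical helper (not part of Spark): writes one Long to a local text file.
object CountWriter {
  def writeCount(count: Long, path: String): Path = {
    // Convert the number to its decimal string form and encode it as UTF-8 bytes.
    val bytes = count.toString.getBytes(StandardCharsets.UTF_8)
    // Files.write creates the file, or truncates it if it already exists.
    Files.write(Paths.get(path), bytes)
  }
}
```

With the `countval` from the question, this could be called as `CountWriter.writeCount(countval, "/tmp/count.txt")`. Because count returns its result to the driver, the file ends up on the driver machine.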

What @szefuf said is correct: after count you have a Long, which you can save any way you want. If you want to save it with .saveAsTextFile(), you first have to convert it back into an RDD:

    sc.parallelize(Seq(countval)).saveAsTextFile("/file/location")

The parallelize method of SparkContext turns a collection of values into an RDD, so you first need to wrap the single value in a one-element sequence. Then you can save it.
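Put together as a small standalone program, the approach might look like the sketch below. It assumes a local Spark installation with the Spark SQL module on the classpath; the object name `SaveCount`, the sample data, and the choice of a single partition are assumptions made for illustration, not something stated in the question.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical example program: counts distinct values, then saves the count as text.
object SaveCount {
  def main(args: Array[String]): Unit = {
    // Assumption: run locally; on a cluster the master is normally set by spark-submit.
    val spark = SparkSession.builder()
      .appName("save-count")
      .master("local[*]")
      .getOrCreate()
    val sc = spark.sparkContext

    // Sample stand-in for the question's `data` RDD.
    val data = sc.parallelize(Seq("a", "b", "a", "c"))

    // Same computation as in the question: count is an action and returns a Long.
    val countval: Long = data.map((_, "")).reduceByKey(_ + _).count()

    // Wrap the Long in a one-element Seq and turn it back into an RDD.
    // numSlices = 1 keeps the output to a single part file instead of
    // one mostly empty file per default partition.
    sc.parallelize(Seq(countval), numSlices = 1).saveAsTextFile(args(0))

    spark.stop()
  }
}
```

The output location is a directory, not a single file, and saveAsTextFile fails if that directory already exists. The path is resolved by Hadoop's filesystem layer, so a URI such as `hdfs://...` or `file:///...` selects HDFS or the local filesystem respectively.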

