
Scala reading file with Spark

I am trying to read a file that looks like this:

you 0.0432052044116
i 0.0391075831328
the 0.0328010698268
to 0.0237549924919
a 0.0209682886489
it 0.0198104294359

And I'd like to store it in an RDD of (key, value) pairs, e.g. (you, 0.0432). For the moment I only have this:

import java.io.{FileNotFoundException, IOException}
import scala.io.Source

val filename = "freq2.txt"
try {
  for (line <- Source.fromFile(filename).getLines()) {
    val tuple = line.split(" ")
    val key = tuple(0)
    val score = tuple(1) // the frequency, still a String here
    println(s"$key")
    println(s"$score")
  }
} catch {
  case ex: FileNotFoundException => println("Couldn't find that file.")
  case ex: IOException => println("Had an IOException trying to read that file")
}

But I don't know how to store the data...

You can directly read the data into an RDD:

val FIELD_SEP = " " // or whatever separator you have
val dataset = sparkContext.textFile(sourceFile).map { line =>
  val word :: score :: _ = line.split(FIELD_SEP).toList
  (word, score)
}
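
If you want the frequency as a number rather than a String, a minimal variant of the same idea (assuming every line has exactly two space-separated fields) parses the second field with toDouble:

// Sketch, assuming each line is exactly "<word> <score>":
val scored = sparkContext.textFile(sourceFile).map { line =>
  val Array(word, score) = line.split(FIELD_SEP)
  (word, score.toDouble) // keep the frequency numeric
}
scored.take(3).foreach(println) // e.g. (you,0.0432052044116)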
Alternatively, since textFile already yields one element per line, you only need to split each line on spaces:

val filename = "freq2.txt"
sc.textFile(filename)
  .map { x =>
    val data = x.trim().split(" ") // split "<word> <score>" on the space
    (data(0), data(1))
  }
  .foreach(println)
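
Note that the final foreach is what actually runs the side-effecting println; the original map(y => println(y)) only builds an RDD[Unit] lazily and never prints anything until an action is triggered. Also, on a cluster foreach(println) prints on the executors, so for small datasets you may prefer collect().foreach(println) to see the output on the driver.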
