Scala/Spark App with “No TypeTag available” Error in “def main” style App

I'm new to the Scala/Spark stack and I'm trying to test my basic skills by using SparkSQL to map RDDs to temp tables and vice versa.

I have 2 distinct .scala files with the same code: a simple object (with def main...) and an object extending App.

In the simple-object version I get a "No TypeTag available" error connected to my case class Log:

object counter {
  def main(args: Array[String]) {
    // ...
    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    import sqlContext.createSchemaRDD
    case class Log(visitatore: String, data: java.util.Date, pagina: String, count: Int)
    val log = triple.map(p => Log(p._1, p._2, p._3, p._4))
    log.registerTempTable("logs")
    val logSessioni = sqlContext.sql("SELECT visitatore, data, pagina, count FROM logs")
    logSessioni.foreach(println)
  }
}

The error at the line log.registerTempTable("logs") says "No TypeTag available for Log".

In the other file (the object extending App) everything works fine:

object counterApp extends App {
  // ...
  val sqlContext = new org.apache.spark.sql.SQLContext(sc)
  import sqlContext.createSchemaRDD
  case class Log(visitatore: String, data: java.util.Date, pagina: String, count: Int)
  val log = triple.map(p => Log(p._1, p._2, p._3, p._4))
  log.registerTempTable("logs")
  val logSessioni = sqlContext.sql("SELECT visitatore, data, pagina, count FROM logs")
  logSessioni.foreach(println)
}

Since I've just started, there are two main points I don't get:

1) Why does the same code work fine in the second file (the object extending App) but produce this error in the first one (the simple object)?

2) (most important) What should I change in my code (the simple-object file) to fix this error, so I can work with case classes and TypeTags (which I barely know)?

Any answers or code examples will be much appreciated!

Thanks in advance

FF

TL;DR:

Just move your case class out of the method definition.

The problem is that your case class Log is defined inside the method that uses it. So simply move the case class definition outside the method and it will work. I would have to look at how this compiles down, but my guess is that it is a chicken-and-egg problem: the TypeTag (used for reflection) cannot be implicitly materialized because the class has not been fully defined at that point. There are two SO questions with the same problem which show that Spark would need to use a WeakTypeTag, and a JIRA issue explaining this more officially.
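To see the fix without spinning up Spark, here is a minimal sketch using plain Scala runtime reflection. The names TypeTagDemo and describe are hypothetical, but describe places the same implicit TypeTag requirement on its type parameter that Spark's createSchemaRDD places on the RDD's element type. Because Log is declared at the top level (not inside main), the compiler can materialize the implicit TypeTag; if you moved the case class inside main, the same call would fail to compile with "No TypeTag available for Log".

```scala
import scala.reflect.runtime.universe._

// Top-level case class: the compiler can materialize a TypeTag[Log] here.
case class Log(visitatore: String, pagina: String, count: Int)

object TypeTagDemo {
  // Same implicit requirement as Spark's schema inference:
  // the context bound [T: TypeTag] asks for an implicit TypeTag[T].
  def describe[T: TypeTag]: String = typeOf[T].typeSymbol.name.toString

  def main(args: Array[String]): Unit = {
    // Compiles because Log is defined outside any method.
    println(describe[Log]) // prints "Log"
  }
}
```

The same reasoning applies to the original code: declaring Log as a top-level class (or inside a companion object, outside def main) lets the implicit TypeTag be resolved at the registerTempTable call.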
