
java.lang.RuntimeException: Unsupported literal type class org.joda.time.DateTime

I am working on a project where I use a library that is quite new to me, although I have used it in other projects without any problems:

org.joda.time.DateTime

I work with Scala and run the project as a job on Databricks.

scalaVersion := "2.11.12"

The code where the exception comes from, according to my investigation so far, is the following:

    var lastEndTime = config.getState("some parameters")

    val timespanStart: Long = lastEndTime // last query ending time
    var timespanEnd: Long = (System.currentTimeMillis / 1000) - (60*840) // 14 hours ago

    val start = new DateTime(timespanStart * 1000)
    val end = new DateTime(timespanEnd * 1000)

    val date = DateTime.now()

The getState() function returns 1483228800 as a Long value; 1483228800 seconds is 2017-01-01T00:00:00Z, which is exactly the timestamp shown in the error below.

EDIT: I use the start and end dates for filtering while building a DataFrame; I compare columns (of timestamp type) with these values:

val df2 = df
           .where(col("column_name").isNotNull)
           .where(col("column_name") > start &&
                  col("column_name") <= end)

The error I get:

ERROR Uncaught throwable from user code: java.lang.RuntimeException: Unsupported literal type class org.joda.time.DateTime 2017-01-01T00:00:00.000Z

I am not sure I actually understand how and why this is an error, so any kind of help is more than welcome! Thank you in advance!

This is a common problem when people start working with Spark SQL. Spark SQL has its own types, and you need to work with them if you want to take advantage of the DataFrame API. In your example you cannot compare a DataFrame column (referenced with the Spark SQL function col) against a DateTime object directly, unless you use a UDF.

If you want to make the comparison using Spark SQL functions, you can take a look at this post, which covers the differences between Dates and Timestamps with Spark DataFrames.
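For example, if column_name is a native TimestampType column, one way to stay within the Spark SQL functions (a minimal sketch, reusing df, timespanStart and timespanEnd from your question) is to convert the bounds to java.sql.Timestamp, which Spark SQL does accept as a literal, and leave the filter itself unchanged:

    import java.sql.Timestamp
    import org.apache.spark.sql.functions.{col, lit}
    import org.joda.time.DateTime

    val start = new DateTime(timespanStart * 1000)
    val end   = new DateTime(timespanEnd * 1000)

    // java.sql.Timestamp is a supported literal type, unlike org.joda.time.DateTime
    val startTs = new Timestamp(start.getMillis)
    val endTs   = new Timestamp(end.getMillis)

    val df2 = df
      .where(col("column_name").isNotNull)
      .where(col("column_name") > lit(startTs) &&
             col("column_name") <= lit(endTs))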

If you (for any reason) need to use Joda, you will have to build your own UDF:

import org.apache.spark.sql.DataFrame
import org.joda.time.DateTime
import org.joda.time.format.{DateTimeFormat, DateTimeFormatter}

object JodaFormater {
  val formatter: DateTimeFormatter = DateTimeFormat.forPattern("dd/MM/yyyy HH:mm:ss")
}

object testJoda {

  import org.apache.spark.sql.functions.{udf, col}
  import JodaFormater._

  // Curried factory: capture the Joda bounds, then return a UDF that parses
  // the string column and checks whether it falls inside (start, end)
  def your_joda_compare_udf = (start: DateTime) => (end: DateTime) => udf { str: String =>
    val dt: DateTime = formatter.parseDateTime(str)
    dt.isAfter(start.getMillis) && dt.isBefore(end.getMillis)
  }

  def main(args: Array[String]) : Unit = {

    val start: DateTime = ???
    val end : DateTime = ???

    // Your dataframe with your date as StringType

    val df: DataFrame = ???
    df.where(your_joda_compare_udf(start)(end)(col("your_date")))

  }
}

Note that this implementation implies some overhead (memory and GC) because of the conversion from StringType to a Joda DateTime object, so you should use the Spark SQL functions whenever you can. In some posts you can read that UDFs are black boxes because Spark cannot optimize their execution, but sometimes they help.
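For completeness, here is a rough sketch of the UDF-free alternative, assuming Spark 2.2+ (for to_timestamp) and a string column that follows the same dd/MM/yyyy HH:mm:ss pattern as the formatter above; the helper name filterBetween is just for the example:

    import java.sql.Timestamp
    import org.apache.spark.sql.DataFrame
    import org.apache.spark.sql.functions.{col, lit, to_timestamp}
    import org.joda.time.DateTime

    def filterBetween(df: DataFrame, start: DateTime, end: DateTime): DataFrame = {
      // Parse the string column into a native TimestampType column once, then
      // compare against java.sql.Timestamp literals; Catalyst can optimize this,
      // unlike an opaque UDF.
      val parsed = to_timestamp(col("your_date"), "dd/MM/yyyy HH:mm:ss")
      df.where(parsed > lit(new Timestamp(start.getMillis)) &&
               parsed <= lit(new Timestamp(end.getMillis)))
    }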
