Why do I get a type mismatch error when using a UDF that returns an object of type Option[Long]?

I am trying to write a User Defined Function (UDF) in Scala that handles null values. In my example, I want to return the epoch for a column when the value is not null. I found that a UDF can return an Option[...] to represent a null result.

Here is my UDF:

def to_epoch(date: Timestamp) : Option[Long] = {
    if(date != null) {
        Option.apply(date.getTime)
    } else {
        Option.empty
    }
}

val toEpoch: (Timestamp => Option[Long]) => UserDefinedFunction = udf((_: Timestamp => Option[Long]))

I am creating a dataframe from a file that is read as follows, and I want to add the column "dateEpoch". I don't know how to make it handle the Option[Long] that my udf returns:

spark.read
     .schema(ListeningStatsSchema.schema)
     .json(location)
     .withColumn("dateEpoch", toEpoch(col("EventTS")))

The error I get is:

type mismatch;
 found   : org.apache.spark.sql.Column
 required: java.sql.Timestamp => Option[Long]
            .withColumn("opd", toEpoch(col("event_TS")))

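The error means that toEpoch, as you declared it, is not a UDF. Its type is (Timestamp => Option[Long]) => UserDefinedFunction, so it expects a Scala function of type Timestamp => Option[Long] as its argument (cf. the type reported in the error message) and only returns a UDF once it is given one. You are passing it a Column instead, hence the mismatch. Column is the main type you manipulate in Spark SQL. You can work with it through predefined functions and operators (columns can be added with +, for instance) or through UDFs, but not through regular Scala functions.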
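To make the distinction concrete, here is a minimal sketch, assuming a local SparkSession and a made-up two-column DataFrame. The object name ColumnVsFunction, the function twice, and the column names a and b are hypothetical and not part of the question.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, udf}

object ColumnVsFunction {
  // A regular Scala function: it works on Long values, not on Columns.
  val twice: Long => Long = _ * 2

  def main(args: Array[String]): Unit = {
    // Assumption: a throwaway local session, only for this illustration.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("column-vs-function")
      .getOrCreate()
    import spark.implicits._

    val df = Seq((1L, 2L)).toDF("a", "b")

    // Built-in operators are defined on Column, so this compiles as is.
    val summed = df.withColumn("sum", col("a") + col("b"))

    // This would NOT compile (found: Column, required: Long):
    // summed.withColumn("doubled", twice(col("a")))

    // Wrapping the function with udf gives a UserDefinedFunction,
    // which can be applied to Columns.
    val twiceUdf = udf(twice)
    summed.withColumn("doubled", twiceUdf(col("a"))).show()

    spark.stop()
  }
}
```

The commented-out line fails for the same kind of reason as the call in the question: the compiler sees a Column where the declared parameter type asks for something else.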

To fix your code, you need to turn your function into a Spark UDF with the udf function. You can do it like this:

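import org.apache.spark.sql.functions.{current_timestamp, udf}

// Turn the to_epoch method into a function value and wrap it as a UDF
val to_epoch_udf = udf(to_epoch _)

// And we can try it:
spark.range(1).select(to_epoch_udf(current_timestamp())).show()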

which gives:

+------------------------+
|UDF(current_timestamp())|
+------------------------+
|1599492185730           |
+------------------------+
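The same approach can be applied to a column that actually contains nulls. The following is a sketch under stated assumptions: the in-memory rows stand in for the JSON file from the question, and the object name ToEpochExample and the id column are hypothetical. It also uses Option(date), which yields None for a null argument, as a shorter equivalent of the explicit null check.

```scala
import java.sql.Timestamp
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, udf}

object ToEpochExample {
  // Option(x) is None when x is null, so no explicit if/else is needed.
  def toEpoch(date: Timestamp): Option[Long] = Option(date).map(_.getTime)

  def main(args: Array[String]): Unit = {
    // Assumption: a throwaway local session, only for this illustration.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("to-epoch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical stand-in for the file read in the question:
    // one row with a timestamp, one row with a null EventTS.
    val events = Seq[(Int, Option[Timestamp])](
      (1, Some(Timestamp.valueOf("2020-09-07 15:23:05"))),
      (2, None)
    ).toDF("id", "EventTS")

    // Wrap the method as a UDF, then apply it to the Column.
    val toEpochUdf = udf(toEpoch _)
    events.withColumn("dateEpoch", toEpochUdf(col("EventTS"))).show(false)

    spark.stop()
  }
}
```

With this setup, the row whose EventTS is null should get a null dateEpoch, since a None returned from the UDF is stored as a SQL null.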
