
How to use a non-UDF method in Spark?

I have code which looks like the following:

myDF.map { x =>
  val inp = MyUtils.doSomething(x.value) // accepts an Int and returns an Int
  MyInfo(inp)
}

Here MyUtils.doSomething is a normal function (not a UDF) in my Spark Scala code. It works fine.

But when I do this:

   val DF = myDF.withColumn("value", lit(MyUtils.doSomething(col("value").asInstanceOf[Int].toInt)))

why does it show this error?

class org.apache.spark.sql.Column cannot be cast to class java.lang.Integer

How can I fix this? Is there any way to get the underlying value of col("value") so that I can use it in my doSomething function?

Not sure why col("value").asInstanceOf[Int].toInt its not giving Int value?

Not sure why col("value").asInstanceOf[Int].toInt its not giving Int value?

Well, how would you expect to cast a Column object like Column("colName", 21, false) to an Int? asInstanceOf basically makes the compiler ignore the fact that an object of type Column is not an integer, so instead of a compile-time error you get an exception at runtime. You should write your code so that you never need asInstanceOf. As for your first snippet: a UDF is basically a function that Spark serializes, ships to the executors, and applies to column values, so you'll have to do it like this:

import org.apache.spark.sql.functions._

val doSomethingUdf = udf(MyUtils.doSomething)
// If doSomething is defined as a method ("def doSomething ..."),
// eta-expand it explicitly: udf(MyUtils.doSomething _)
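
You then pass the Column itself to the UDF rather than trying to extract its value. A minimal sketch, assuming MyUtils.doSomething takes and returns an Int and that myDF has an integer "value" column, as described in the question:

import org.apache.spark.sql.functions.{col, udf}

// Wrap the plain Scala function so Spark can apply it to each row on the executors.
// Assumes MyUtils.doSomething: Int => Int.
val doSomethingUdf = udf((v: Int) => MyUtils.doSomething(v))

// Pass the Column to the UDF; Spark supplies each row's value at runtime.
val DF = myDF.withColumn("value", doSomethingUdf(col("value")))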
