val trim: String => String = _.trim.replace("[\\r\\n]", "")
def main(args: Array[String]) {
val spark = ... ...
import spark.implicits._
val trimUDF = udf[String,String](trim)
val df = spark.read.json(df_path) ...
val fixed_dblogs_df = df.withColumn("qp_new", trimUDF('qp)) ...
}
When I run this code I get a compile time error:
No TypeTag available for String
This error is where I define the udf function. I have no idea why this is happening. I have used udf functions before but this one is making this error. I used Spark 2.1.1 and that's it.
The purpose of the code is to remove all the new lines in one of my fields of columns that is StringType and I just want it to not have any newlines in it
Is there some reason you're using a UDF instead of the replace_regexp
builtin?
val fixed_dblogs_df = df.withColumn("qp_new", replace_regexp('qp, "[\\r\\n]", "") ...)
UDF's break Spark's plan optimization.
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.