Code -
val rdd=sc.textFile("/tmp/abc.csv")
rdd.first.split(",").zipWithIndex
val rows=rdd.filter(x => !x.contains("ID") && !x.contains("Case Number"))
val split1=rows.map(x => x.split(","))
split1.take(3)
import java.time._
import java.time.format._
val format=DateTimeFormatter.ofPattern("MM/dd/yyyy h:m:s a")
val dates=split1.map( x => LocalDateTime.parse( x(2) , format))
Error:
org.apache.spark.SparkException: Task not serializable at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:304)
Rather ugly way to handle this is to push format initialization inside anonymous function:
split1.map(x =>
LocalDateTime.parse(x(2), DateTimeFormatter.ofPattern("MM/dd/yyyy h:m:s a")))
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.