Converting a Scala case class to JsValue in rdd.map, but Task not serializable
I am new to Scala/Spark and I have an RDD of a case class:
case class Info(key1 : String, key2 : String, key3 : String)
I want to convert RDD[Info] into RDD[JsString] and save it to Elasticsearch. I use play.api.libs.json and define a Writes converter:
implicit val InfoWrites = new Writes[Info] {
  def writes(i : Info): JsObject = Json.obj(
    "key1" -> i.key1,
    "key2" -> i.key2,
    "key3" -> i.key3
  )
}
Then I define an implicit class to provide a save function:
implicit class Saver(rdd : RDD[Info]) {
  def save() : Unit = {
    rdd.map { i => Json.toJson(i).toString }.saveJsonToEs("resource")
  }
}
So I can save an RDD[Info] with:
infoRDD.save()
But I keep getting a "Task not serializable" error from Json.toJson() inside rdd.map().
I also tried defining a serializable object like this:
object jsonUtils extends Serializable {
  def toJsString(i : Info) : String = {
    Json.toJson(i).toString()
  }
}

rdd.map { i => jsonUtils.toJsString(i) }
but I keep getting the "Task not serializable" error.
How should I change the code? Thank you!
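A note on the likely cause (an assumption, since the scope enclosing Saver is not shown): Json.toJson(i) needs the implicit InfoWrites, and if that val lives on a non-serializable enclosing class, the closure passed to rdd.map captures this along with it, which is exactly what triggers "Task not serializable". A minimal sketch that keeps both the Writes and the conversion in top-level objects, so the closure references only statically reachable members (the output path is a placeholder):

import org.apache.spark.rdd.RDD
import play.api.libs.json.{JsObject, Json, Writes}

case class Info(key1 : String, key2 : String, key3 : String)

// Top-level object: reached statically by the closure, so no
// enclosing instance is captured and serialized with the task.
object InfoJson {
  implicit val InfoWrites: Writes[Info] = new Writes[Info] {
    def writes(i : Info): JsObject = Json.obj(
      "key1" -> i.key1,
      "key2" -> i.key2,
      "key3" -> i.key3
    )
  }

  def toJsString(i : Info): String = Json.toJson(i).toString
}

object InfoSyntax {
  implicit class Saver(rdd : RDD[Info]) {
    def save() : Unit = {
      // Only InfoJson (a top-level object) appears in the closure.
      // "/tmp/info-json" is a placeholder path for this sketch.
      rdd.map(i => InfoJson.toJsString(i)).saveAsTextFile("/tmp/info-json")
    }
  }
}

With import InfoSyntax._ in scope, infoRDD.save() compiles and the map stage no longer drags in an outer class.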
I ran the code below, which is similar to yours, and it works for me:
import models.Info
import org.apache.spark.rdd.RDD
import play.api.libs.json.Json
import domain.utils.Implicits._
class CustomFunctions(rdd : RDD[Info]) {
  def save() = {
    rdd.map(i => Json.toJson(i).toString).saveAsTextFile("/home/training/so-123")
  }
}
Wrote the corresponding Implicits:
package domain.utils
import play.api.libs.json.{JsObject, Json, Writes}
import models.Info
class Implicits {
  implicit val InfoWrites = new Writes[Info] {
    def writes(i : Info): JsObject = Json.obj(
      "key1" -> i.key1,
      "key2" -> i.key2,
      "key3" -> i.key3
    )
  }
}
object Implicits extends Implicits
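The object Implicits extends Implicits line is what makes this work as an import target: the implicit Writes becomes a member of a top-level object, so import domain.utils.Implicits._ brings it into scope without any instance state for a closure to capture. A quick sanity check of the conversion (expected output in the comment):

import domain.utils.Implicits._
import models.Info
import play.api.libs.json.Json

object WritesCheck extends App {
  // InfoWrites is resolved from the imported Implicits object.
  println(Json.toJson(Info("name1", "city1", "123")))
  // prints: {"key1":"name1","key2":"city1","key3":"123"}
}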
Created the model Info:
package models
case class Info(key1 : String, key2 : String, key3 : String)
Created a SparkOperationsDao to compose and create the Spark context:
package dao
import domain.utils.CustomFunctions
import models.Info
import org.apache.spark.{SparkConf, SparkContext}
class SparkOperationsDao {
  val conf : SparkConf = new SparkConf().setAppName("driverTrack").setMaster("local")
  val sc = new SparkContext(conf)

  def writeToElastic() = {
    val sample = List(Info("name1", "city1", "123"), Info("name2", "city2", "234"))
    val rdd = sc.parallelize(sample)
    val converter = new CustomFunctions(rdd)
    converter.save()
  }
}
object SparkOperationsDao extends SparkOperationsDao
Run the App:
import dao.SparkOperationsDao
object RapidTests extends App {
  SparkOperationsDao.writeToElastic()
  //.collect.foreach(println)
}
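This saves to a text file rather than Elasticsearch. To write the JSON strings to Elasticsearch as in the question, here is a sketch using the elasticsearch-hadoop connector's saveJsonToEs (assumptions: the elasticsearch-spark artifact is on the classpath, an Elasticsearch node is reachable, and "spark/info" is a placeholder index/type resource):

import org.apache.spark.rdd.RDD
import org.elasticsearch.spark._ // adds saveJsonToEs to RDD[String]
import play.api.libs.json.Json

import domain.utils.Implicits._
import models.Info

class EsCustomFunctions(rdd : RDD[Info]) {
  def save() : Unit = {
    // Each Info is rendered to a JSON string on the executors,
    // then indexed into the given index/type resource.
    rdd.map(i => Json.toJson(i).toString).saveJsonToEs("spark/info")
  }
}

The SparkConf in SparkOperationsDao would also need the connector settings, e.g. .set("es.nodes", "localhost") and, if non-default, .set("es.port", "9200").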