Using Class Methods in Spark RDD Operations Returns Task not serializable Exception
Suppose I have the following class in Spark Scala:
import org.apache.spark.rdd.RDD

class SparkComputation(i: Int, j: Int) {
  def something(x: Int, y: Int) = (x + y) * i

  def processRDD(data: RDD[Int]) = {
    val j = this.j
    val something = this.something _
    data.map(something(_, j))
  }
}
I get a Task not serializable exception when I run the following code:
val s = new SparkComputation(2, 5)
val data = sc.parallelize(0 to 100)
val res = s.processRDD(data).collect
I'm assuming that the exception occurs because Spark is trying to serialize the SparkComputation instance. To prevent this from happening, I have stored the class members I'm using in the RDD operation in local variables (j and something). However, Spark still tries to serialize the SparkComputation object because of the method. Is there any way to pass the class method to map without forcing Spark to serialize the whole SparkComputation class? I know the following code works without any problem:
def processRDD(data: RDD[Int]) = {
  val j = this.j
  val i = this.i
  data.map(x => (x + j) * i)
}
So, the class members that store values are not causing the problem. The problem is with the function. I have also tried the following approach with no luck:
class SparkComputation(i: Int, j: Int) {
  def processRDD(data: RDD[Int]) = {
    val j = this.j
    val i = this.i
    def something(x: Int, y: Int) = (x + y) * i
    data.map(something(_, j))
  }
}
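Make the class serializable. The expression this.something _ creates a function object that holds a reference to the instance, so Spark has to serialize the instance together with the closure: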
import org.apache.spark.rdd.RDD

class SparkComputation(i: Int, j: Int) extends Serializable {
  def something(x: Int, y: Int) = (x + y) * i

  def processRDD(data: RDD[Int]) = {
    val j = this.j
    val something = this.something _
    data.map(something(_, j))
  }
}
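If making the whole class serializable is not an option, one possible alternative is to replace the method with a function value that only closes over local copies of the fields. The sketch below is not Spark code: it uses plain Java serialization as a stand-in for what Spark does when it ships a closure, so it can run without a cluster. The names SerializationCheck, isSerializable, Computation, viaMethod and viaLocalFunction are made up for this illustration and do not come from the question or from Spark.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical helper (not a Spark API): reports whether Java serialization accepts a value.
object SerializationCheck {
  def isSerializable(value: AnyRef): Boolean = {
    val out = new ObjectOutputStream(new ByteArrayOutputStream())
    try {
      out.writeObject(value)
      true
    } catch {
      case _: NotSerializableException => false
    } finally {
      out.close()
    }
  }
}

// Hypothetical stand-in for SparkComputation; deliberately NOT Serializable.
class Computation(i: Int, j: Int) {
  def something(x: Int, y: Int): Int = (x + y) * i

  // Same pattern as the question: the eta-expanded method keeps a reference to `this`.
  def viaMethod: Int => Int = {
    val j = this.j
    val something = this.something _
    x => something(x, j)
  }

  // Alternative: a function value that only closes over local copies of i and j.
  def viaLocalFunction: Int => Int = {
    val i = this.i
    val j = this.j
    val something = (x: Int, y: Int) => (x + y) * i
    x => something(x, j)
  }
}
```

Passing the result of viaMethod to SerializationCheck.isSerializable should fail on the non-serializable instance, while the result of viaLocalFunction should pass, since it carries only two Int values and a function value. In processRDD the same idea would mean declaring something as a function value built from the local i, then calling data.map(something(_, j)) as before.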