Spark Structured Streaming for appending to text file using foreach
I want to append lines to a text file using Structured Streaming. This code results in

SparkException: Task not serializable

I think toDF is not allowed here. How can I get this code to work?
df.writeStream
  .foreach(new ForeachWriter[Row] {
    override def open(partitionId: Long, version: Long): Boolean = {
      true
    }
    override def process(row: Row): Unit = {
      // Fails: this runs on an executor, which cannot create a
      // DataFrame or start a new batch write.
      val df = Seq(row.getString(0)).toDF
      df.write.format("text").mode("append").save(output)
    }
    override def close(errorOrNull: Throwable): Unit = {
    }
  }).start
You cannot call df.write.format("text").mode("append").save(output) inside the process method: process runs on the executor side, where starting a new batch write is not supported. You can use the file sink instead, such as

df.writeStream.format("text")....
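A minimal sketch of the suggested file-sink approach, assuming df is a streaming Dataset with a single string column as in the question; the source here (a socket) and the output/checkpoint paths are placeholders, and note that the file sink requires a checkpoint location and supports only append mode:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder.appName("append-text").getOrCreate()

// Placeholder streaming source; substitute your own (Kafka, files, ...).
val df = spark.readStream
  .format("socket")
  .option("host", "localhost")
  .option("port", 9999)
  .load()

val output = "/tmp/stream-output" // hypothetical output directory

val query = df.writeStream
  .format("text")                               // file sink writes text files
  .option("path", output)                       // output directory
  .option("checkpointLocation", s"$output/_chk") // required by file sinks
  .outputMode("append")                         // file sink supports append only
  .start()

query.awaitTermination()
```

The file sink handles partitioning, fault tolerance, and exactly-once semantics for you, which is why it is preferred over re-implementing file appends inside a ForeachWriter.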