
Spark Structured Streaming for appending to text file using foreach

I want to append lines to a text file using Structured Streaming. This code results in SparkException: Task not serializable. I think toDF is not allowed here. How could I get this code to work?

df.writeStream
  .foreach(new ForeachWriter[Row] {
    override def open(partitionId: Long, version: Long): Boolean = {
      true
    }

    override def process(row: Row): Unit = {
      val df = Seq(row.getString(0)).toDF

      df.write.format("text").mode("append").save(output)
    }

    override def close(errorOrNull: Throwable): Unit = {
    }
  }).start

You cannot call df.write.format("text").mode("append").save(output) inside the process method: process runs on the executor side, where a SparkSession is not available, so creating and writing a DataFrame there fails. Use the built-in file sink instead, such as

df.writeStream.format("text")....
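A fuller sketch of the file-sink approach suggested above (the output and checkpoint paths are placeholder assumptions, not values from the question):

```scala
// Minimal sketch of the file sink, assuming `df` is the streaming DataFrame
// from the question and `spark` is an active SparkSession.
val query = df.writeStream
  .format("text")                                   // built-in file sink for text output
  .option("path", "/tmp/stream-output")             // placeholder output directory
  .option("checkpointLocation", "/tmp/stream-ckpt") // required for file sinks
  .outputMode("append")                             // the file sink only supports append mode
  .start()

query.awaitTermination()
```

Note that the file sink does not append to a single text file: each micro-batch writes new part files into the output directory, and the checkpoint location lets Spark track which batches have already been committed.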
