Spark Structured Streaming -

I'm trying to run the following code from IntelliJ IDEA to print messages from Kafka to the console, but it throws the following error:

Exception in thread "main" org.apache.spark.sql.AnalysisException: Queries with streaming sources must be executed with writeStream.start();;

The stack trace starts at Dataset.checkpoint and goes up from there. If I remove .checkpoint(), I get a different error, related to permissions:

17/08/02 12:10:52 ERROR StreamMetadata: Error writing stream metadata StreamMetadata(4e612f22-efff-4c9a-a47a-a36eb533e9d6) to C:/Users/rp/AppData/Local/Temp/temporary-2f570b97-ad16-4f00-8356-d43ccb7660db/metadata
java.io.IOException: (null) entry in command string: null chmod 0644 C:\Users\rp\AppData\Local\Temp\temporary-2f570b97-ad16-4f00-8356-d43ccb7660db\metadata

Source:

def main(args : Array[String]) = {
  val spark = SparkSession.builder().appName("SparkStreaming").master("local[*]").getOrCreate()

  val canonicalSchema = new StructType()
    .add("cid", StringType)
    .add("uid", StringType)
    .add("sourceSystem", new StructType()
      .add("id", StringType)
      .add("name", StringType))
    .add("name", new StructType()
      .add("firstname", StringType)
      .add("lastname", StringType))

  val messages = spark
    .readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092")
    .option("subscribe", "c_canonical")
    .option("startingOffset", "earliest")
    .load()
    .checkpoint()
    .select(from_json(col("value").cast("string"), canonicalSchema))
    .writeStream.outputMode("append").format("console").start.awaitTermination
}

Can anyone please help me understand what I'm doing wrong?

  1. Structured Streaming doesn't support Dataset.checkpoint(). There is an open ticket to provide a better error message or to just ignore the call: https://issues.apache.org/jira/browse/SPARK-20927

  2. The IOException is probably because you have not installed Cygwin on Windows.
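
As a rough sketch of what point 1 implies for the code in the question, the version below simply drops the Dataset.checkpoint() call and leaves everything else in place. Three things here are not in the original and should be read as assumptions: the enclosing object name `KafkaToConsole` and the column alias `message` are made up for illustration, and the option is spelled `startingOffsets` (plural), which is the name used in the Kafka source documentation, whereas the question uses `startingOffset`. Running it still needs a Kafka broker at localhost:9092 and the spark-sql-kafka connector on the classpath, as in the question.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, from_json}
import org.apache.spark.sql.types.{StringType, StructType}

// Hypothetical wrapper object; the question only shows the main method.
object KafkaToConsole {

  // Same schema as in the question.
  val canonicalSchema: StructType = new StructType()
    .add("cid", StringType)
    .add("uid", StringType)
    .add("sourceSystem", new StructType()
      .add("id", StringType)
      .add("name", StringType))
    .add("name", new StructType()
      .add("firstname", StringType)
      .add("lastname", StringType))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("SparkStreaming")
      .master("local[*]")
      .getOrCreate()

    val messages = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092")
      .option("subscribe", "c_canonical")
      // Assumption: plural option name, as in the Kafka source docs.
      .option("startingOffsets", "earliest")
      .load()
      // No Dataset.checkpoint() here: it is not supported on a streaming Dataset.
      .select(from_json(col("value").cast("string"), canonicalSchema).as("message"))

    // Start the streaming query and block until it is stopped.
    val query = messages.writeStream
      .outputMode("append")
      .format("console")
      .start()

    query.awaitTermination()
  }
}
```

If the .checkpoint() call was meant to enable fault-tolerant progress tracking, Structured Streaming configures that through the `checkpointLocation` option on `writeStream` rather than through Dataset.checkpoint(). This sketch does not address the Windows permission error from point 2.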
