
java.lang.IllegalArgumentException: 'path' is not specified // Spark Consumer Issue

I am trying to create a Spark consumer so that I can send messages (in this case, the contents of a CSV file) to Kafka through Spark Streaming. However, I get an error saying that 'path' is not specified.

My code is as follows:

import org.apache.log4j.{Level, Logger}
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.streaming.FileStreamSource.Timestamp
import org.apache.spark.sql.types.{StringType, StructField, StructType}
import org.apache.spark.sql.functions.from_json
import org.apache.spark.sql.streaming.OutputMode

object sparkConsumer extends App {

  val conf = new SparkConf().setMaster("local").setAppName("Name")
  val sc = new SparkContext(conf)

  val rootLogger = Logger.getRootLogger()
  rootLogger.setLevel(Level.ERROR)

  val spark = SparkSession
    .builder()
    .appName("Spark-Kafka-Integration")
    .master("local")
    .getOrCreate()

  val schema = StructType(Array(
    StructField("InvoiceNo", StringType, nullable = true),
    StructField("StockCode", StringType, nullable = true),
    StructField("Description", StringType, nullable = true),
    StructField("Quantity", StringType, nullable = true)
  ))

  val streamingDataFrame = spark.readStream.schema(schema).csv("C:/Users/me/Desktop/Tasks/Tasks1/test.csv")

  streamingDataFrame.selectExpr("CAST(InvoiceNo AS STRING) AS key", "to_json(struct(*)) AS value").
    writeStream
    .format("csv")
    .option("topic", "topic_test")
    .option("kafka.bootstrap.servers", "localhost:9092")
    .option("checkpointLocation", "C:/Users/me/IdeaProjects/SparkStreaming/checkpointLocation/")
    .start()

  import spark.implicits._
  val df = spark
    .readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092")
    .option("subscribe", "topic_test")
    .load()

  val df1 = df.selectExpr("CAST(value AS STRING)", "CAST(timestamp AS TIMESTAMP)").as[(String, Timestamp)]
    .select(from_json($"value", schema).as("data"), $"timestamp")
    .select("data.*", "timestamp")

  df1.writeStream
    .format("console")
    .option("truncate","false")
    .outputMode(OutputMode.Append)
    .start()
    .awaitTermination()

}

I get the following error:

Exception in thread "main" java.lang.IllegalArgumentException: 'path' is not specified

Does anyone know what I am missing?

The problem seems to be in this part of your code:

  streamingDataFrame.selectExpr("CAST(InvoiceNo AS STRING) AS key", "to_json(struct(*)) AS value").
    writeStream
    .format("csv")
    .option("topic", "topic_test")
    .option("kafka.bootstrap.servers", "localhost:9092")
    .option("checkpointLocation", "C:/Users/me/IdeaProjects/SparkStreaming/checkpointLocation/")
    .start()

The reason is that you use the "csv" format for the sink but don't set the output path that it requires, which is what triggers the "'path' is not specified" error. At the same time, the options you set ("topic" and "kafka.bootstrap.servers") are Kafka properties, so you apparently want a Kafka topic as your sink. If you change the format to "kafka", it should work.
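A minimal sketch of the corrected write, reusing the topic, broker address, paths, and schema from the question. It assumes the spark-sql-kafka-0-10 connector matching your Spark and Scala versions is on the classpath, and that a Kafka broker is reachable at localhost:9092. The object name CsvToKafka is made up for illustration.

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.{StringType, StructField, StructType}

// Hypothetical object name; everything else is taken from the question.
object CsvToKafka extends App {

  val spark = SparkSession
    .builder()
    .appName("Spark-Kafka-Integration")
    .master("local")
    .getOrCreate()

  val schema = StructType(Array(
    StructField("InvoiceNo", StringType, nullable = true),
    StructField("StockCode", StringType, nullable = true),
    StructField("Description", StringType, nullable = true),
    StructField("Quantity", StringType, nullable = true)
  ))

  // Streaming CSV source. The path is expected to be a directory of CSV files.
  val streamingDataFrame = spark.readStream
    .schema(schema)
    .csv("C:/Users/me/Desktop/Tasks/Tasks1/test.csv")

  // The Kafka sink reads the "key" and "value" columns of each row.
  val query = streamingDataFrame
    .selectExpr("CAST(InvoiceNo AS STRING) AS key", "to_json(struct(*)) AS value")
    .writeStream
    .format("kafka") // was "csv", which requires a 'path' option
    .option("topic", "topic_test")
    .option("kafka.bootstrap.servers", "localhost:9092")
    .option("checkpointLocation", "C:/Users/me/IdeaProjects/SparkStreaming/checkpointLocation/")
    .start()

  // Block the main thread so the streaming query keeps running.
  query.awaitTermination()
}

If you really wanted a CSV file sink instead, you would keep format("csv") and add an output directory with .option("path", ...) in place of the two Kafka options.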

Another problem you may run into when using CSV as a streaming source is that the path should point to a directory, not to a single file. In your case, if you create a directory and move your CSV file into it, it will work.

Just for testing, create a directory named C:/Users/me/Desktop/Tasks/Tasks1/test.csv and create a file named part-0000.csv inside it. Then put your CSV content in this new file and start the process again.
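The directory setup described above could be scripted roughly as follows. This is only a sketch: it assumes test.csv currently exists as a regular file at the path from the question, and the object name PrepareCsvDirectory and the temporary name test.csv.tmp are made up for illustration.

import java.nio.file.{Files, Paths, StandardCopyOption}

// Hypothetical helper: turns the single file test.csv into a directory
// of the same name that contains the data as part-0000.csv.
object PrepareCsvDirectory extends App {

  val csvPath = Paths.get("C:/Users/me/Desktop/Tasks/Tasks1/test.csv")

  if (Files.isRegularFile(csvPath)) {
    // Move the file out of the way first, because the directory reuses its name.
    val tmp = csvPath.resolveSibling("test.csv.tmp") // assumed temporary name
    Files.move(csvPath, tmp, StandardCopyOption.REPLACE_EXISTING)

    // Create the directory and move the data into it.
    Files.createDirectory(csvPath)
    Files.move(tmp, csvPath.resolve("part-0000.csv"))
  }
}

After this runs, the readStream call from the question can keep the same path, since it now refers to a directory.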


