Facing an issue in Spark Structured Streaming

I have written code that reads a CSV file and prints it to the console using Spark Structured Streaming. The code is below:


    import java.util.ArrayList;
    import java.util.List;

    import org.apache.spark.api.java.function.FlatMapFunction;
    import org.apache.spark.sql.*;
    import org.apache.spark.sql.streaming.StreamingQuery;
    import org.apache.spark.sql.Encoders;
    import org.apache.spark.sql.types.StructType;
    import com.cybernetix.models.BaseDataModel;

    public class ReadCSVJob   {

        static List<BaseDataModel>  bdmList=new ArrayList<BaseDataModel>();

        public static void main(String args[]) {

             SparkSession spark = SparkSession
                      .builder()
                      .config("spark.eventLog.enabled", "false")
                      .config("spark.driver.memory", "2g")
                      .config("spark.executor.memory", "2g")
                      .appName("StructuredStreamingAverage")
                      .master("local")
                      .getOrCreate();



            StructType userSchema = new StructType();
            userSchema.add("name", "string");
            userSchema.add("status", "String");
            userSchema.add("u_startDate", "String");
            userSchema.add("u_lastlogin", "string");
            userSchema.add("u_firstName", "string");
            userSchema.add("u_lastName", "string");
            userSchema.add("u_phone","string");
            userSchema.add("u_email", "string")
                    ;

            Dataset<Row> dataset = spark.
                    readStream().
                    schema(userSchema)
                    .csv("D:\\user\\sdata\\user-2019-10-03_20.csv");


            dataset.writeStream()
            .format("console")
            .option("truncate","false")
            .start();


        }

    }

In this code, the line userSchema.add("name", "string"); causes the program to terminate. The log trace is below.

    ANTLR Tool version 4.7 used for code generation does not match the current runtime version 4.5.3
    ANTLR Runtime version 4.7 used for parser compilation does not match the current runtime version 4.5.3
    Exception in thread "main" java.lang.ExceptionInInitializerError
        at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parse(ParseDriver.scala:84)
        at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parseDataType(ParseDriver.scala:39)
        at org.apache.spark.sql.types.StructType.add(StructType.scala:213)
        at com.cybernetix.sparks.jobs.ReadCSVJob.main(ReadCSVJob.java:45)
    Caused by: java.lang.UnsupportedOperationException: java.io.InvalidClassException: org.antlr.v4.runtime.atn.ATN; Could not deserialize ATN with UUID 59627784-3be5-417a-b9eb-8131a7286089 (expected aadb8d7e-aeef-4415-ad2b-8204d6cf042e or a legacy UUID).
        at org.antlr.v4.runtime.atn.ATNDeserializer.deserialize(ATNDeserializer.java:153)
        at org.apache.spark.sql.catalyst.parser.SqlBaseLexer.<clinit>(SqlBaseLexer.java:1175)
        ... 4 more
    Caused by: java.io.InvalidClassException: org.antlr.v4.runtime.atn.ATN; Could not deserialize ATN with UUID 59627784-3be5-417a-b9eb-8131a7286089 (expected aadb8d7e-aeef-4415-ad2b-8204d6cf042e or a legacy UUID).
        ... 6 more

I have added the ANTLR Maven dependency to my pom.xml file, but I still face the same issue.

<!-- https://mvnrepository.com/artifact/org.antlr/antlr4 -->
<dependency>
    <groupId>org.antlr</groupId>
    <artifactId>antlr4</artifactId>
    <version>4.7</version>
</dependency>

I am not sure why, even after adding the ANTLR dependency, the Maven dependency list still shows antlr-runtime-4.5.3.jar. See the screenshot below.

[Screenshot: Maven dependency list]

Can anyone help me figure out what I am doing wrong here?

Update your artifactId to antlr4-runtime and try again. Please do a clean build afterwards.

The dependency should look like this:

<dependency>
    <groupId>org.antlr</groupId>
    <artifactId>antlr4-runtime</artifactId>
    <version>4.7</version>
</dependency>
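To confirm which ANTLR runtime jar is actually loaded after changing the dependency, a small classpath probe can help. The sketch below is not part of the original answer. The class name ClasspathProbe and the method locate are hypothetical names chosen for illustration, and the default class to look up (org.antlr.v4.runtime.atn.ATN) is taken from the stack trace above. It uses only the JDK, so it compiles even when ANTLR is missing, and it should be run with the same classpath as the Spark job.

```java
import java.security.CodeSource;

// Hypothetical helper (not from the original post): reports which jar or
// directory a class is loaded from on the current classpath.
class ClasspathProbe {

    /** Returns the location a class is loaded from, or a short message if unavailable. */
    static String locate(String className) {
        try {
            // initialize=false: load the class without running its static initializers
            Class<?> cls = Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            CodeSource source = cls.getProtectionDomain().getCodeSource();
            if (source == null || source.getLocation() == null) {
                // Classes loaded by the bootstrap loader (such as java.lang.String) have no code source
                return "no code source (likely a JDK class)";
            }
            return source.getLocation().toString();
        } catch (ClassNotFoundException e) {
            return "not found: " + className;
        }
    }

    public static void main(String[] args) {
        // Default to the ANTLR class named in the stack trace; pass another class name to override
        String name = args.length > 0 ? args[0] : "org.antlr.v4.runtime.atn.ATN";
        System.out.println(name + " -> " + locate(name));
    }
}
```

If the printed location still points at a 4.5.3 jar, another dependency is probably pulling in the older runtime. Running mvn dependency:tree -Dincludes=org.antlr should show which one.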
