
Why is “Cannot call methods on a stopped SparkContext” thrown when connecting to Spark Standalone from Java application?

I have downloaded Apache Spark 1.4.1 pre-built for Hadoop 2.6 and later. I have two Ubuntu 14.04 machines. I have set one of them up as the Spark master running a single slave, and the second machine is running one Spark slave. When I execute the ./sbin/start-all.sh command, the master and the slaves start successfully. After that I ran the sample Pi program in spark-shell, setting --master spark://192.168.0.105:7077 to the Spark master URL displayed in the Spark web UI.

So far everything works great.

I have created a Java application and I tried to configure it to run Spark jobs when needed. I added the Spark dependencies in the pom.xml file.

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.11</artifactId>
            <version>1.4.1</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-streaming_2.11</artifactId>
            <version>1.4.1</version>
        </dependency>

I have created a SparkConf:

private SparkConf sparkConfig = new SparkConf(true)
        .setAppName("Spark Worker")
        .setMaster("spark://192.168.0.105:7077");

And I create a SparkContext using the SparkConf:

private SparkContext sparkContext = new SparkContext(sparkConfig);

At this step, the following error is thrown:

java.lang.IllegalStateException: Cannot call methods on a stopped SparkContext
    at org.apache.spark.SparkContext.org$apache$spark$SparkContext$$assertNotStopped(SparkContext.scala:103)
    at org.apache.spark.SparkContext.getSchedulingMode(SparkContext.scala:1503)
    at org.apache.spark.SparkContext.postEnvironmentUpdate(SparkContext.scala:2007)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:543)
    at com.storakle.dataimport.spark.StorakleSparkConfig.getSparkContext(StorakleSparkConfig.java:37)
    at com.storakle.dataimport.reportprocessing.DidNotBuyProductReport.prepareReportData(DidNotBuyProductReport.java:25)
    at com.storakle.dataimport.messagebroker.RabbitMQMessageBroker$1.handleDelivery(RabbitMQMessageBroker.java:56)
    at com.rabbitmq.client.impl.ConsumerDispatcher$5.run(ConsumerDispatcher.java:144)
    at com.rabbitmq.client.impl.ConsumerWorkService$WorkPoolRunnable.run(ConsumerWorkService.java:99)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

If I change the Spark master to local, everything works just fine.

private SparkConf sparkConfig = new SparkConf(true)
        .setAppName("Spark Worker")
        .setMaster("local");

I am running the Java app on the same machine that hosts the Spark Master.

I have no idea why this is happening. Every documentation and example that I've found so far indicates that the code should work with the Spark Master URL.

Any ideas why this is happening and how I can fix it? I have spent a lot of time trying to figure this out, with no luck so far.

I think you are using Spark 1.4.1 built for Scala 2.10. Therefore, you need spark-core_2.10 and spark-streaming_2.10 instead of the 2.11 artifacts. spark-core_2.11 is incompatible with Spark built for Scala 2.10.
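
Under that assumption (a pre-built 1.4.1 download targeting Scala 2.10), the only change needed in the pom.xml is the Scala suffix of the artifact IDs; a minimal sketch of the corrected dependencies:

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.10</artifactId>
            <version>1.4.1</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-streaming_2.10</artifactId>
            <version>1.4.1</version>
        </dependency>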

Alternatively, for building Spark for Scala 2.11, see:

http://spark.apache.org/docs/latest/building-spark.html#building-for-scala-211
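
Once the artifact versions match the Scala build of the cluster, a minimal self-contained sketch of a Java driver (a hypothetical SparkConnectivityCheck class, using the JavaSparkContext wrapper rather than the raw SparkContext) can help confirm that the application really reaches the standalone master:

    import java.util.Arrays;

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;

    public class SparkConnectivityCheck {
        public static void main(String[] args) {
            // Same configuration as in the question; the master URL is the one
            // shown in the Spark web UI.
            SparkConf sparkConfig = new SparkConf(true)
                    .setAppName("Spark Worker")
                    .setMaster("spark://192.168.0.105:7077");

            JavaSparkContext sparkContext = new JavaSparkContext(sparkConfig);

            // A trivial job: if the context was created against a running master,
            // this prints 4 instead of failing with "stopped SparkContext".
            long count = sparkContext.parallelize(Arrays.asList(1, 2, 3, 4)).count();
            System.out.println("count = " + count);

            sparkContext.stop();
        }
    }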
