Error when deploying Spark from eclipse to YARN in cluster mode

I am trying to deploy the Apache Spark Pi example from Eclipse to Hadoop YARN. I run my own cluster of 3 Linux virtual machines. The cluster has Hadoop 2.7.2 and Spark 1.6.0 pre-built for Hadoop 2.6.0 and later. I can run the Pi example from a node, but when I try to run the Java Pi example from Eclipse on Windows in yarn-cluster mode, I get the error shown below. I found several threads about this error, but most of them are specific to Cloudera or Hortonworks, involve some extra variables, or do not solve my problem. I also tried yarn-client mode with the same result. Can somebody help me, please?

Eclipse console output:

16/02/23 11:21:51 INFO Client: Requesting a new application from cluster with 1 NodeManagers
16/02/23 11:21:51 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
16/02/23 11:21:51 INFO Client: Will allocate AM container, with 1384 MB memory including 384 MB overhead
16/02/23 11:21:51 INFO Client: Setting up container launch context for our AM
16/02/23 11:21:51 INFO Client: Setting up the launch environment for our AM container
16/02/23 11:21:51 INFO Client: Preparing resources for our AM container
16/02/23 11:21:53 WARN : Your hostname, uherpc resolves to a loopback/non-reachable address: 172.25.32.214, but we couldn't find any external IP address!
16/02/23 11:21:53 INFO Client: Uploading resource file:/C:/Users/xuherv00/.gradle/caches/modules-2/files-2.1/org.apache.spark/spark-yarn_2.10/1.6.0/ace7b1f6f0c33b48e0323b7b0e7dd8ab458c14a4/spark-yarn_2.10-1.6.0.jar -> hdfs://sparkmaster:9000/user/hduser/.sparkStaging/application_1456222391080_0002/spark-yarn_2.10-1.6.0.jar
16/02/23 11:21:54 INFO Client: Uploading resource file:/C:/Users/xuherv00/workspace/rapidminer5/RapidMiner_Extension_Streaming/lib/spark-examples-1.6.0-hadoop2.6.0.jar -> hdfs://sparkmaster:9000/user/hduser/.sparkStaging/application_1456222391080_0002/spark-examples-1.6.0-hadoop2.6.0.jar
16/02/23 11:22:02 INFO Client: Uploading resource file:/C:/Users/xuherv00/AppData/Local/Temp/spark-266eade5-5049-4b13-9f75-edb5200e3df1/__spark_conf__6296221515875913107.zip -> hdfs://sparkmaster:9000/user/hduser/.sparkStaging/application_1456222391080_0002/__spark_conf__6296221515875913107.zip
16/02/23 11:22:02 INFO SecurityManager: Changing view acls to: xuherv00,hduser
16/02/23 11:22:02 INFO SecurityManager: Changing modify acls to: xuherv00,hduser
16/02/23 11:22:02 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(xuherv00, hduser); users with modify permissions: Set(xuherv00, hduser)
16/02/23 11:22:02 INFO Client: Submitting application 2 to ResourceManager
16/02/23 11:22:02 INFO YarnClientImpl: Submitted application application_1456222391080_0002
16/02/23 11:22:03 INFO Client: Application report for application_1456222391080_0002 (state: ACCEPTED)
16/02/23 11:22:03 INFO Client: 
     client token: N/A
     diagnostics: N/A
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1456222434780
     final status: UNDEFINED
     tracking URL: http://sparkmaster:8088/proxy/application_1456222391080_0002/
     user: hduser
16/02/23 11:22:04 INFO Client: Application report for application_1456222391080_0002 (state: ACCEPTED)
16/02/23 11:22:05 INFO Client: Application report for application_1456222391080_0002 (state: ACCEPTED)
16/02/23 11:22:06 INFO Client: Application report for application_1456222391080_0002 (state: ACCEPTED)
16/02/23 11:22:07 INFO Client: Application report for application_1456222391080_0002 (state: ACCEPTED)
16/02/23 11:22:08 INFO Client: Application report for application_1456222391080_0002 (state: ACCEPTED)
16/02/23 11:22:09 INFO Client: Application report for application_1456222391080_0002 (state: ACCEPTED)
16/02/23 11:22:10 INFO Client: Application report for application_1456222391080_0002 (state: FAILED)
16/02/23 11:22:10 INFO Client: 
     client token: N/A
     diagnostics: Application application_1456222391080_0002 failed 2 times due to AM Container for appattempt_1456222391080_0002_000002 exited with  exitCode: 1
For more detailed output, check application tracking page:http://sparkmaster:8088/cluster/app/application_1456222391080_0002Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1456222391080_0002_02_000001
Exit code: 1
Stack trace: ExitCodeException exitCode=1: 
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
    at org.apache.hadoop.util.Shell.run(Shell.java:456)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)


Container exited with a non-zero exit code 1
Failing this attempt. Failing the application.
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1456222434780
     final status: FAILED
     tracking URL: http://sparkmaster:8088/cluster/app/application_1456222391080_0002
     user: hduser
16/02/23 11:22:10 INFO Client: Deleting staging directory .sparkStaging/application_1456222391080_0002
16/02/23 11:22:10 INFO ShutdownHookManager: Shutdown hook called
16/02/23 11:22:10 INFO ShutdownHookManager: Deleting directory C:\Users\xuherv00\AppData\Local\Temp\spark-266eade5-5049-4b13-9f75-edb5200e3df1

Hadoop application log:

User:   hduser
Name:   testApp
Application Type:   SPARK
Application Tags:   
YarnApplicationState:   FAILED
FinalStatus Reported by AM:     FAILED
Started:    Tue Feb 23 11:13:54 +0100 2016
Elapsed:    8sec
Tracking URL:   History
Diagnostics:    
Application application_1456222391080_0002 failed 2 times due to AM Container for appattempt_1456222391080_0002_000002 exited with exitCode: 1
For more detailed output, check application tracking page:http://sparkmaster:8088/cluster/app/application_1456222391080_0002Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1456222391080_0002_02_000001
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
at org.apache.hadoop.util.Shell.run(Shell.java:456)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 1
Failing this attempt. Failing the application.

stderr log of container_1456222391080_0002_01_000001:

Error: Could not find or load main class org.apache.spark.deploy.yarn.ApplicationMaster

Gradle dependencies:

//hadoop
compile 'org.apache.hadoop:hadoop-common:2.7.2'
compile 'org.apache.hadoop:hadoop-hdfs:2.7.2'
compile 'org.apache.hadoop:hadoop-client:2.7.2'
compile 'org.apache.hadoop:hadoop-mapreduce-client-core:2.7.2'
compile 'org.apache.hadoop:hadoop-yarn-common:2.7.2'
compile 'org.apache.hadoop:hadoop-yarn-api:2.7.2'

//spark
compile 'org.apache.spark:spark-core_2.10:1.6.0'
compile 'org.apache.spark:spark-sql_2.10:1.6.0'
compile 'org.apache.spark:spark-streaming_2.10:1.6.0'
compile 'org.apache.spark:spark-catalyst_2.10:1.6.0'
compile 'org.apache.spark:spark-yarn_2.10:1.6.0'
compile 'org.apache.spark:spark-network-shuffle_2.10:1.6.0'
compile 'org.apache.spark:spark-network-common_2.10:1.6.0'
compile 'org.apache.spark:spark-network-yarn_2.10:1.6.0'

Java class:

package mypackage;

import java.io.File;
import java.lang.reflect.Method;
import java.net.URI;
import java.net.URL;
import java.net.URLClassLoader;

import org.apache.hadoop.conf.Configuration;
import org.apache.spark.SparkConf;
import org.apache.spark.deploy.yarn.Client;
import org.apache.spark.deploy.yarn.ClientArguments;
import org.junit.Test;

public class ExampleApp {
    private String appName = "testApp";
//  private String mode = "yarn-client";
    private String mode = "yarn-cluster";
    private File appJar = new File("lib/spark-examples-1.6.0-hadoop2.6.0.jar");
    private URI appJarUri = appJar.toURI();
    private String hadoopPath = "E:\\store\\hadoop";

    @Test
    public void deployPiToYARN() {
        String[] args = new String[] {
                // the name of your application
                "--name", appName,

                // memory for driver (optional)
                "--driver-memory", "1000M",

                // path to your application's JAR file
                // required in yarn-cluster mode
                "--jar", appJarUri.toString(),

                // name of your application's main class (required)
                "--class", "org.apache.spark.examples.SparkPi",

                // comma separated list of local jars that want
                // SparkContext.addJar to work with
//              "--addJars",
//              "lib/spark-assembly-1.6.0-hadoop2.6.0.jar",

                // argument 1 to Spark program
                 "--arg",
                 "10",
        };

        System.setProperty("hadoop.home.dir", hadoopPath);

        System.setProperty("HADOOP_USER_NAME", "hduser");
        try {
            addHadoopConfToClassPath(hadoopPath);
        } catch (Exception e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
        // create a Hadoop Configuration object
        Configuration config = new Configuration();
        config.set("yarn.resourcemanager.address", "172.25.32.192:8050");

        // identify that you will be using Spark as YARN mode
        System.setProperty("SPARK_YARN_MODE", "true");

        // create an instance of SparkConf object
        SparkConf sparkConf = new SparkConf().setAppName(appName);
        sparkConf = sparkConf.setMaster(mode);
        sparkConf = sparkConf.set("spark.executor.memory","1g");

        // create ClientArguments, which will be passed to Client
        ClientArguments cArgs = new ClientArguments(args, sparkConf);
        // create an instance of yarn Client client
        Client client = new Client(cArgs, config, sparkConf);

//      client.submitApplication();
        // submit Spark job to YARN
        client.run();
    }

    // Reflectively append the Hadoop conf directory to the system classloader
    // so that core-site.xml and yarn-site.xml are picked up from the local disk.
    private void addHadoopConfToClassPath(String path) throws Exception {
        File f = new File(path);
        URL u = f.toURI().toURL();
        URLClassLoader urlClassLoader = (URLClassLoader) ClassLoader.getSystemClassLoader();
        Class<URLClassLoader> urlClass = URLClassLoader.class;
        Method method = urlClass.getDeclaredMethod("addURL", new Class[]{URL.class});
        method.setAccessible(true);
        method.invoke(urlClassLoader, new Object[]{u});
    }
}

core-site.xml:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://sparkmaster:9000</value>
    </property>
</configuration>

hdfs-site.xml:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
        <property>
                <name>dfs.replication</name>
                <value>3</value>
        </property>
        <property>
                <name>dfs.namenode.name.dir</name>
                <value>file:/usr/local/hadoop_tmp/hdfs/namenode</value>
        </property>
        <property>
                <name>dfs.datanode.data.dir</name>
                <value>file:/usr/local/hadoop_tmp/hdfs/datanode</value>
        </property>
</configuration>

yarn-site.xml:

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>sparkmaster</value>
  </property>

  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>sparkmaster:8025</value>
  </property>

  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>sparkmaster:8035</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>sparkmaster:8050</value>
  </property>
</configuration>

You should not launch Spark on YARN from Eclipse; use SparkSubmit instead. You can, however, use local mode from Eclipse, as the sketch below shows.
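For reference, local mode runs the whole job inside the Eclipse JVM, so no cluster and no jar staging are involved. A minimal sketch (the class name LocalModeExample is made up for illustration; it needs only the spark-core_2.10 artifact already on the question's classpath):

package mypackage;

import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class LocalModeExample {
    public static void main(String[] args) {
        // local[*] runs Spark inside this JVM with one worker thread per core
        SparkConf conf = new SparkConf().setAppName("testApp").setMaster("local[*]");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // trivial job, just to prove the local-mode setup works
        long count = sc.parallelize(Arrays.asList(1, 2, 3, 4)).count();
        System.out.println("count = " + count);

        sc.stop();
    }
}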

SparkSubmit does a lot of things for you, including uploading the dependencies (such as the Spark jars) to the YARN cluster, where the executors will reference them. That is why you are getting the error above.
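If you still want a programmatic submission from Java, a safer route is the SparkLauncher API, which drives spark-submit in a child process so that staging of the Spark jars is handled for you. Below is a minimal sketch; the Spark home path, the HADOOP_CONF_DIR value, and the class name SubmitPiToYarn are assumptions based on the question's setup, and it needs the extra org.apache.spark:spark-launcher_2.10:1.6.0 dependency:

package mypackage;

import java.util.HashMap;
import java.util.Map;

import org.apache.spark.launcher.SparkLauncher;

public class SubmitPiToYarn {
    public static void main(String[] args) throws Exception {
        // Environment for the child spark-submit process. Both paths are
        // placeholders based on the question's setup -- adjust as needed.
        Map<String, String> env = new HashMap<String, String>();
        env.put("HADOOP_CONF_DIR", "E:\\store\\hadoop");   // assumed dir holding core-site.xml, yarn-site.xml
        env.put("HADOOP_USER_NAME", "hduser");

        Process spark = new SparkLauncher(env)
                // local Spark distribution; spark-submit is taken from here (assumed path)
                .setSparkHome("E:\\store\\spark-1.6.0-bin-hadoop2.6")
                .setAppResource("lib/spark-examples-1.6.0-hadoop2.6.0.jar")
                .setMainClass("org.apache.spark.examples.SparkPi")
                .setMaster("yarn-cluster")
                .setAppName("testApp")
                .setConf(SparkLauncher.DRIVER_MEMORY, "1000m")
                .setConf(SparkLauncher.EXECUTOR_MEMORY, "1g")
                .addAppArgs("10")          // argument to SparkPi
                .launch();

        // spark-submit runs as a separate OS process; wait for it to finish
        int exitCode = spark.waitFor();
        System.out.println("spark-submit exited with code " + exitCode);
    }
}

The same submission works from a shell on a cluster node as spark-submit --master yarn-cluster --class org.apache.spark.examples.SparkPi lib/spark-examples-1.6.0-hadoop2.6.0.jar 10. If you insist on the Client-based code, a commonly suggested workaround is to point spark.yarn.jar in the SparkConf at a local or HDFS copy of the full spark-assembly-1.6.0-hadoop2.6.0.jar; as the upload log shows, only spark-yarn_2.10-1.6.0.jar was staged, which is not enough for YARN to start the ApplicationMaster.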
