Java Apache Spark MLlib

I have started learning Apache Spark MLlib in Java, following the official Spark 2.1.1 documentation. I have spark-2.1.1-bin-hadoop2.7 installed on Ubuntu 14.04 LTS, and I am trying to run this code:

import org.apache.spark.ml.classification.LogisticRegression;
import org.apache.spark.ml.classification.LogisticRegressionModel;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class JavaLogisticRegressionWithElasticNetExample {
    public static void main(String[] args) {
        // Run locally, using all available cores
        SparkSession spark = SparkSession.builder()
                .appName("JavaLogisticRegressionWithElasticNetExample")
                .master("local[*]")
                .getOrCreate();

        // $example on$
        // Load training data
        Dataset<Row> training = spark.read().format("libsvm")
                .load("data/mllib/sample_libsvm_data.txt");

        LogisticRegression lr = new LogisticRegression()
                .setMaxIter(10)
                .setRegParam(0.3)
                .setElasticNetParam(0.8);

        // Fit the model
        LogisticRegressionModel lrModel = lr.fit(training);

        // Print the coefficients and intercept for logistic regression
        System.out.println("Coefficients: "
                + lrModel.coefficients() + " Intercept: " + lrModel.intercept());

        // We can also use the multinomial family for binary classification
        LogisticRegression mlr = new LogisticRegression()
                .setMaxIter(10)
                .setRegParam(0.3)
                .setElasticNetParam(0.8)
                .setFamily("multinomial");

        // Fit the model
        LogisticRegressionModel mlrModel = mlr.fit(training);

        // Print the coefficients and intercepts for logistic regression with multinomial family
        // (read from mlrModel, the multinomial fit, not lrModel)
        System.out.println("Multinomial coefficients: " + mlrModel.coefficientMatrix()
                + "\nMultinomial intercepts: " + mlrModel.interceptVector());
        // $example off$

        spark.stop();
    }
}

The Spark dependencies in my pom.xml are:

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.11</artifactId>
    <version>2.1.1</version>
    <scope>provided</scope>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-mllib_2.10</artifactId>
    <version>2.1.1</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-mllib-local_2.10 -->
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-mllib-local_2.10</artifactId>
    <version>2.1.1</version>
</dependency>

But I am getting this exception:

17/09/08 16:42:19 INFO SparkEnv: Registering OutputCommitCoordinator
Exception in thread "main" java.lang.NoSuchMethodError: scala.Predef$.$scope()Lscala/xml/TopScope$;
    at org.apache.spark.ui.jobs.AllJobsPage.<init>(AllJobsPage.scala:39)
    at org.apache.spark.ui.jobs.JobsTab.<init>(JobsTab.scala:38)
    at org.apache.spark.ui.SparkUI.initialize(SparkUI.scala:65)
    at org.apache.spark.ui.SparkUI.<init>(SparkUI.scala:82)
    at org.apache.spark.ui.SparkUI$.create(SparkUI.scala:220)
    at org.apache.spark.ui.SparkUI$.createLiveUI(SparkUI.scala:162)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:452)
    at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2320)
    at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:868)
    at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:860)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:860)
    at JavaLogisticRegressionWithElasticNetExample.main(JavaLogisticRegressionWithElasticNetExample.java:12)
17/09/08 16:42:19 INFO DiskBlockManager: Shutdown hook called
17/09/08 16:42:19 INFO ShutdownHookManager: Shutdown hook called
17/09/08 16:42:19 INFO ShutdownHookManager: Deleting directory /tmp/spark-8460a189-3039-47ec-8d75-9e0ca8b4ee5d
17/09/08 16:42:19 INFO ShutdownHookManager: Deleting directory /tmp/spark-8460a189-3039-47ec-8d75-9e0ca8b4ee5d/userFiles-9b6994eb-1376-47a3-929e-e415e1fdb0c0

This kind of error happens when libraries built for different Scala versions are mixed in the same program. And indeed, the dependencies in your pom.xml mix them: spark-mllib_2.10 and spark-mllib-local_2.10 are built for Scala 2.10, while spark-sql_2.11 is built for Scala 2.11.
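The mismatch is visible in the artifact IDs themselves, since the suffix after the underscore names the Scala binary version. As a rough illustration, the check can be sketched in plain Java; the class ScalaSuffixCheck and its method scalaVersions are hypothetical names made up for this example, not part of Spark or Maven:

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: collects the Scala binary-version suffixes
// (e.g. "2.10", "2.11") that appear at the end of artifact IDs.
public class ScalaSuffixCheck {
    // Matches a trailing suffix such as "_2.10" or "_2.11".
    private static final Pattern SUFFIX = Pattern.compile("_(\\d+\\.\\d+)$");

    /** Returns the distinct Scala binary versions found in the given artifact IDs. */
    public static TreeSet<String> scalaVersions(List<String> artifactIds) {
        TreeSet<String> versions = new TreeSet<>();
        for (String id : artifactIds) {
            Matcher m = SUFFIX.matcher(id);
            if (m.find()) {
                versions.add(m.group(1));
            }
        }
        return versions;
    }

    public static void main(String[] args) {
        // Artifact IDs copied from the pom.xml in the question.
        List<String> fromQuestion = Arrays.asList(
                "spark-sql_2.11", "spark-mllib_2.10", "spark-mllib-local_2.10");
        TreeSet<String> versions = scalaVersions(fromQuestion);
        System.out.println("Scala versions: " + versions);
        // More than one distinct suffix means the dependencies are mixed.
        System.out.println("Mixed: " + (versions.size() > 1));
    }
}
```

More than one distinct suffix in the result means the build pulls in Spark modules compiled against incompatible Scala versions.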

Use spark-sql_2.10 instead of spark-sql_2.11 and you will be fine (or switch the two mllib artifacts to their Scala 2.11 builds, spark-mllib_2.11 and spark-mllib-local_2.11).
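For illustration, the second option might look like the fragment below. This is only a sketch: it keeps the versions and scopes exactly as posted in the question and changes nothing but the Scala suffix on the two mllib artifacts, so every Spark dependency ends in _2.11.

```xml
<!-- All Spark artifacts now share the same Scala binary suffix (_2.11) -->
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.11</artifactId>
    <version>2.1.1</version>
    <scope>provided</scope>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-mllib_2.11</artifactId>
    <version>2.1.1</version>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-mllib-local_2.11</artifactId>
    <version>2.1.1</version>
</dependency>
```

After changing the suffixes, rebuild so that no _2.10 jars are left on the classpath.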
