
Not able to import Spark MLlib in IntelliJ

I am not able to import the Spark MLlib libraries in IntelliJ for a Spark Scala project. I am getting a resolution exception.

Below is my build.sbt:

name := "ML_Spark"  

version := "0.1" 

scalaVersion := "2.11.12"

libraryDependencies += "org.apache.spark" %% "spark-core" % "2.2.1"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.2.1"
libraryDependencies += "org.apache.spark" %% "spark-mllib" % "2.2.1" % "runtime"

I tried to copy/paste the same build.sbt file you provided and I got the following error:

[error] [/Users/pc/testsbt/build.sbt]:3: ';' expected but string literal found.

Actually, the build.sbt is invalid; IntelliJ flags the error as well.

Putting the version and the Scala version on separate lines solved the problem for me:

name := "ML_Spark"  

version := "0.1" 
scalaVersion := "2.11.12"

libraryDependencies += "org.apache.spark" %% "spark-core" % "2.2.1"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.2.1"
libraryDependencies += "org.apache.spark" %% "spark-mllib" % "2.2.1" % "runtime"

I am not sure this is the problem you're facing (could you please share the exception you got?); it might also be a problem with the repositories specified under the .sbt folder in your home directory.

I have run into the same problem before. To solve it, I used the compile scope for spark-mllib instead of the runtime one. Here is my configuration:

name := "SparkDemo"

version := "0.1"

scalaVersion := "2.11.12"

// https://mvnrepository.com/artifact/org.apache.spark/spark-core
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.3.0"
// https://mvnrepository.com/artifact/org.apache.spark/spark-mllib
libraryDependencies += "org.apache.spark" %% "spark-mllib" % "2.3.0"
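
To make the scope difference explicit, here is a hedged build.sbt sketch contrasting the two forms (version numbers taken from the answer above; the error text in the comment is the typical scalac message, which may vary):

```scala
// build.sbt fragment (sketch): sbt dependency scopes.

// No scope (default: Compile) — the jar is on both the compile and run
// classpaths, so `import org.apache.spark.ml._` resolves in the IDE and in sbt.
libraryDependencies += "org.apache.spark" %% "spark-mllib" % "2.3.0"

// % "runtime" — the jar is only on the run classpath, so compile-time
// references to MLlib classes typically fail with an error like
// "object ml is not a member of package org.apache.spark".
// libraryDependencies += "org.apache.spark" %% "spark-mllib" % "2.3.0" % "runtime"
```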

I had a similar issue, but I found a workaround. Namely, you have to add the spark-mllib jar file to your project manually. Even though my build.sbt file was

name := "example_project"

version := "0.1"
scalaVersion := "2.12.10"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "3.0.0",
  "org.apache.spark" %% "spark-sql" % "3.0.0",
  "org.apache.spark" %% "spark-mllib" % "3.0.0" % "runtime"
)

I wasn't able to import the Spark libraries with

import org.apache.spark.sql._
import org.apache.spark.sql.types._
import org.apache.spark.ml._

The solution that worked for me was to add the jar file manually. Specifically,

  1. Download the jar file of the ML library you need (e.g. for Spark 3 use https://mvnrepository.com/artifact/org.apache.spark/spark-mllib_2.12/3.0.0 ).

  2. Follow this link to add the jar file to your IntelliJ project: Correct way to add external jars (lib/*.jar) to an IntelliJ IDEA project

  3. Also add the mllib-local jar ( https://mvnrepository.com/artifact/org.apache.spark/spark-mllib-local )

If, for some reason, you rebuild from the build.sbt again, you will need to re-import the jar file.
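
As an alternative to wiring the jar through the IDE dialog, sbt itself supports unmanaged jars: any jar placed in the project's lib/ directory is put on the classpath, and because this lives in the build rather than in IDE settings, it survives a re-import. A minimal sketch (the directory name `custom_lib` is an arbitrary example, only needed if you want something other than the default lib/):

```scala
// build.sbt fragment (sketch): unmanaged dependencies.
// By default, sbt adds every jar found in <project root>/lib to the classpath;
// no setting is required for that. To scan a different directory instead,
// override unmanagedBase:
unmanagedBase := baseDirectory.value / "custom_lib"
```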
