Runtime Exception java.lang.NoSuchMethodError: com.google.common.base.Optional.toJavaUtil()L with Spark-BigQuery connector

I'm trying to connect to BigQuery from Spark. I have built a fat jar using the sbt assembly plugin and am trying to launch the job in local mode using spark-submit. As soon as the Spark job is launched, I get a java.lang.NoSuchMethodError: com.google.common.base.Optional.toJavaUtil()Ljava/util/Optional; exception.

Below is the exception trace:

Exception in thread "main" java.lang.NoSuchMethodError: com.google.common.base.Optional.toJavaUtil()Ljava/util/Optional;
    at com.google.cloud.spark.bigquery.SparkBigQueryConfig.getOption(SparkBigQueryConfig.java:265)
    at com.google.cloud.spark.bigquery.SparkBigQueryConfig.getOption(SparkBigQueryConfig.java:256)
    at com.google.cloud.spark.bigquery.SparkBigQueryConfig.lambda$getOptionFromMultipleParams$7(SparkBigQueryConfig.java:273)
    at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
    at java.util.Spliterators$IteratorSpliterator.tryAdvance(Spliterators.java:1812)
    at java.util.stream.ReferencePipeline.forEachWithCancel(ReferencePipeline.java:126)
    at java.util.stream.AbstractPipeline.copyIntoWithCancel(AbstractPipeline.java:499)
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:486)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
    at java.util.stream.FindOps$FindOp.evaluateSequential(FindOps.java:152)
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.util.stream.ReferencePipeline.findFirst(ReferencePipeline.java:464)
    at com.google.cloud.spark.bigquery.SparkBigQueryConfig.getOptionFromMultipleParams(SparkBigQueryConfig.java:275)
    at com.google.cloud.spark.bigquery.SparkBigQueryConfig.from(SparkBigQueryConfig.java:119)
    at com.google.cloud.spark.bigquery.BigQueryRelationProvider.createSparkBigQueryConfig(BigQueryRelationProvider.scala:133)
    at com.google.cloud.spark.bigquery.BigQueryRelationProvider.createRelationInternal(BigQueryRelationProvider.scala:71)
    at com.google.cloud.spark.bigquery.BigQueryRelationProvider.createRelation(BigQueryRelationProvider.scala:45)
    at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:340)
    at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:239)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:227)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:174)
    at com.bigquery.OwnDataSetReader$.delayedEndpoint$com$$bigquery$OwnDataSetReader$1(OwnDataSetReader.scala:18)
    at com.bigquery.OwnDataSetReader$delayedInit$body.apply(OwnDataSetReader.scala:6)
    at scala.Function0$class.apply$mcV$sp(Function0.scala:34)
    at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
    at scala.App$$anonfun$main$1.apply(App.scala:76)
    at scala.App$$anonfun$main$1.apply(App.scala:76)
    at scala.collection.immutable.List.foreach(List.scala:381)
    at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:35)
    at scala.App$class.main(App.scala:76)
    at com..bigquery.OwnDataSetReader$.main(OwnDataSetReader.scala:6)
    at com..bigquery.OwnDataSetReader.main(OwnDataSetReader.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:894)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

After some research, I found that this exception can happen when multiple versions of the Guava library are present. I made sure there are no such conflicts in the final jar, and verified it by decompiling the jar file. No conflicts were observed, but the issue still persists. Below is the build.sbt snippet:

name := "bigquer-connector"

version := "0.1"

scalaVersion := "2.11.8"
test in assembly := {}

assemblyJarName in assembly := "BigQueryConnector.jar"

assemblyMergeStrategy in assembly := {
  case x if x.startsWith("META-INF") => MergeStrategy.discard
  case x =>
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)

}

libraryDependencies += ("com.google.cloud.spark" %% "spark-bigquery" % "0.18.0")
  .exclude("com.google.guava", "guava")
  .exclude("org.glassfish.jersey.bundles.repackaged", "jersey-guava")

libraryDependencies += "com.google.guava" % "guava" % "30.0-jre"

libraryDependencies += ("org.apache.spark" % "spark-core_2.11" % "2.3.1")
  .exclude("com.google.guava", "guava")
  .exclude("org.glassfish.jersey.bundles.repackaged", "jersey-guava")


libraryDependencies += ("org.apache.spark" % "spark-sql_2.11" % "2.3.1")
  .exclude("com.google.guava", "guava")
  .exclude("org.glassfish.jersey.bundles.repackaged", "jersey-guava")

Below is the main class:

package com.bigquery

import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

object OwnDataSetReader extends App {

  val session = SparkSession.builder()
    .appName("big-query-connector")
    .config(getConf)
    .getOrCreate()

  session.read
    .format("com.google.cloud.spark.bigquery")
    .option("viewsEnabled", true)
    .option("parentProject", "my_gcp_project")
    .option("credentialsFile", "<path to private json file>")
    .load("my_gcp_data_set.my_gcp_view")
    .show(2)

  private def getConf : SparkConf = {
    val sparkConf = new SparkConf
    sparkConf.setAppName("biq-query-connector")
    sparkConf.setMaster("local[*]")

    sparkConf
  }
}

Command used to launch Spark from my local terminal: spark-submit --deploy-mode client --class com.bigquery.OwnDataSetReader BigQueryConnector.jar. I'm using Spark version 2.3.x on my local machine.
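Since inspecting the jar showed no Guava conflict, it may help to check which Guava the JVM actually loads at runtime. Below is a minimal sketch of such a check, not part of the original post; the object name GuavaOrigin is hypothetical. It relies on the fact that Optional.toJavaUtil() only exists in Guava 21 and later, and assumes the older copy may be coming from outside the fat jar (Spark 2.x distributions typically ship an older Guava in their own jars directory).

```scala
// Hypothetical diagnostic; GuavaOrigin is an illustrative name, not from the original post.
object GuavaOrigin {
  def main(args: Array[String]): Unit = {
    // Load the Guava class named in the stack trace.
    val cls = Class.forName("com.google.common.base.Optional")

    // getCodeSource can be null for some class loaders, so wrap it in Option.
    val location = Option(cls.getProtectionDomain.getCodeSource)
      .flatMap(cs => Option(cs.getLocation))
      .map(_.toString)
      .getOrElse("<unknown>")
    println(s"Guava Optional loaded from: $location")

    // toJavaUtil is the method the NoSuchMethodError complains about.
    val hasToJavaUtil = cls.getMethods.exists(_.getName == "toJavaUtil")
    println(s"Optional.toJavaUtil present: $hasToJavaUtil")
  }
}
```

If this is packaged into the same assembly jar and run through spark-submit, a location outside BigQueryConnector.jar would suggest that another Guava on the classpath is taking precedence over the one in the fat jar.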

I was able to fix the issue. The problem was the merge strategy in my build.sbt file:

assemblyMergeStrategy in assembly := {
  case x if x.startsWith("META-INF") => MergeStrategy.discard
  case x =>
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)

}

I was discarding every file in the META-INF folder. The config files inside the META-INF folder of the spark-bigquery connector are used during library bootstrap, so instead of discarding them all, changing the strategy as shown below worked for me.

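assemblyMergeStrategy in assembly := {
  case PathList("META-INF", xs @ _*) =>
    (xs map {_.toLowerCase}) match {
      // Drop manifests, indexes, and license/notice files.
      case ("manifest.mf" :: Nil) | ("index.list" :: Nil) | ("dependencies" :: Nil) | ("license" :: Nil) | ("licence.txt" :: Nil) | ("notice.txt" :: Nil) | ("notice" :: Nil) =>
        MergeStrategy.discard
      // Drop jar signature files and nested license/notice entries.
      case ps @ (_ :: _) if ps.last.endsWith(".sf") || ps.last.endsWith(".dsa") || ps.contains("license") || ps.contains("notice") =>
        MergeStrategy.discard
      case "plexus" :: _ =>
        MergeStrategy.discard
      // Keep service registrations, merging them line by line.
      case "services" :: _ =>
        MergeStrategy.filterDistinctLines
      case _ => MergeStrategy.last
    }
  // Everything outside META-INF falls back to the previous strategy, as in the original build.sbt.
  case x =>
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}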

This is probably caused by a version mismatch among the dependencies of com.google.cloud.spark:spark-bigquery_2.11:0.18.1. It was resolved for me by using com.google.cloud.spark:spark-bigquery-with-dependencies_2.11:0.18.1, which brings in all the dependent libraries.
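In build.sbt terms, that switch might look like the fragment below. This is a sketch only: the Scala version is taken from the question, and whether the Guava excludes and the explicit Guava dependency from the question are still needed with this artifact is an assumption that is not confirmed here.

```scala
// build.sbt fragment (sketch); versions mirror the ones mentioned in this thread.
scalaVersion := "2.11.8"

// %% appends the Scala binary version, resolving to spark-bigquery-with-dependencies_2.11.
libraryDependencies += "com.google.cloud.spark" %% "spark-bigquery-with-dependencies" % "0.18.1"
```

This line would replace the plain spark-bigquery dependency rather than sit alongside it, so that only one copy of the connector ends up in the assembly.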
