Spark Read HBase with java.lang.NoSuchMethodError: org.apache.hadoop.mapreduce.InputSplit.getLocationInfo error

I want to use Scala to read HBase through Spark, but I get this error:

Exception in thread "dag-scheduler-event-loop" java.lang.NoSuchMethodError: org.apache.hadoop.mapreduce.InputSplit.getLocationInfo()[Lorg/apache/hadoop/mapred/SplitLocationInfo;

But I have already added the dependencies, and this problem keeps bothering me. My environment is as follows:

  • Scala: 2.11.12
  • Spark: 2.3.1
  • HBase: maybe 2.1.0 (I'm not sure)
  • Hadoop: 2.7.2.4

And my build.sbt is:

libraryDependencies ++= Seq(
    "org.apache.spark" % "spark-core_2.11" % "2.3.1",
    "org.apache.spark" % "spark-sql_2.11" % "2.3.1",
    "org.apache.spark" % "spark-streaming_2.11" % "2.3.1",
    "org.apache.spark" % "spark-streaming-kafka-0-10_2.11" % "2.3.1",
    "org.apache.spark" % "spark-yarn_2.11" % "2.3.1",
    "org.apache.hadoop" % "hadoop-core" % "2.6.0-mr1-cdh5.15.1",
    "org.apache.hadoop" % "hadoop-common" % "2.7.2",
    "org.apache.hadoop" % "hadoop-client" % "2.7.2",
    "org.apache.hadoop" % "hadoop-mapred" % "0.22.0",
    "org.apache.hadoop" % "hadoop-nfs" % "2.7.2",
    "org.apache.hadoop" % "hadoop-hdfs" % "2.7.2",
    "org.apache.hadoop" % "hadoop-hdfs-nfs" % "2.7.2",
    "org.apache.hadoop" % "hadoop-mapreduce-client-core" % "2.7.2",
    "org.apache.hadoop" % "hadoop-mapreduce" % "2.7.2",
    "org.apache.hadoop" % "hadoop-mapreduce-client" % "2.7.2",
    "org.apache.hadoop" % "hadoop-mapreduce-client-common" % "2.7.2",
    "org.apache.hbase" % "hbase" % "2.1.0",
    "org.apache.hbase" % "hbase-server" % "2.1.0",
    "org.apache.hbase" % "hbase-common" % "2.1.0",
    "org.apache.hbase" % "hbase-client" % "2.1.0",
    "org.apache.hbase" % "hbase-protocol" % "2.1.0",
    "org.apache.hbase" % "hbase-metrics" % "2.1.0",
    "org.apache.hbase" % "hbase-metrics-api" % "2.1.0",
    "org.apache.hbase" % "hbase-mapreduce" % "2.1.0",
    "org.apache.hbase" % "hbase-zookeeper" % "2.1.0",
    "org.apache.hbase" % "hbase-hadoop-compat" % "2.1.0",
    "org.apache.hbase" % "hbase-hadoop2-compat" % "2.1.0",
    "org.apache.hbase" % "hbase-spark" % "2.1.0-cdh6.1.0"
)

I really don't know where I went wrong. If I have added a wrong dependency, or I need to add a new one, please tell me where I can download it, e.g.: resolvers += "Apache HBase" at "https://repository.apache.org/content/repositories/releases"

Please help me, thanks!
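(The question does not include the reading code itself. For context, a typical minimal reader that hits this code path goes through newAPIHadoopRDD with TableInputFormat; the sketch below assumes that approach and uses a hypothetical table name.)

import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.Result
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.TableInputFormat
import org.apache.spark.sql.SparkSession

object ReadHBase {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("ReadHBase").getOrCreate()

    // Point TableInputFormat at the table to scan ("my_table" is a placeholder).
    val hbaseConf = HBaseConfiguration.create()
    hbaseConf.set(TableInputFormat.INPUT_TABLE, "my_table")

    // newAPIHadoopRDD asks TableInputFormat for splits; the scheduler then calls
    // InputSplit.getLocationInfo on them, which is where the error surfaces when
    // the Hadoop classes on the classpath are too old.
    val hbaseRDD = spark.sparkContext.newAPIHadoopRDD(
      hbaseConf,
      classOf[TableInputFormat],
      classOf[ImmutableBytesWritable],
      classOf[Result])

    println(s"row count: ${hbaseRDD.count()}")
    spark.stop()
  }
}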

Can I get more details about how you are running the Spark job? If you are using a custom distribution such as Cloudera or Hortonworks, you may have to compile against their libraries, and spark-submit will use the distribution's installed classpath to submit the job to the cluster.

To get started, add % "provided" to the Spark libraries in your sbt file so that they are taken from the classpath of the Spark installation at runtime.
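For example, a minimal sketch of that change in build.sbt (versions copied from the question; exactly which artifacts to mark provided depends on what the cluster ships):

// Spark (and the Hadoop artifacts it drags in) are supplied by the cluster at
// runtime, so compile against them without bundling them in the assembly.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "2.3.1" % "provided",
  "org.apache.spark" %% "spark-sql" % "2.3.1" % "provided",
  "org.apache.hadoop" % "hadoop-client" % "2.7.2" % "provided"
)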

You need to fix the versions of these dependencies to match the version of Hadoop you're running; otherwise you can expect classpath/method issues. Specifically, your error comes from the mapreduce package:

"org.apache.hadoop" % "hadoop-core" % "2.6.0-mr1-cdh5.15.1",
"org.apache.hadoop" % "hadoop-mapred" % "0.22.0",

Spark already includes most of the Hadoop artifacts itself, so it's not clear why you're specifying them all yourself, but at least put % "provided" on some of them.

And for hbase-spark, I doubt that you want a cdh6 dependency, because CDH 6 is based on Hadoop 3 libraries, not 2.7.2.
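Putting those two points together, a trimmed build.sbt might look roughly like the sketch below: the hadoop-core 2.6.0-mr1-cdh5.15.1 and hadoop-mapred 0.22.0 lines dropped, Spark/Hadoop marked provided, and hbase-spark replaced by a build compiled against Hadoop 2 (the version string is a placeholder, not a verified coordinate):

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "2.3.1" % "provided",
  "org.apache.spark" %% "spark-sql" % "2.3.1" % "provided",
  "org.apache.hadoop" % "hadoop-client" % "2.7.2" % "provided",
  "org.apache.hbase" % "hbase-client" % "2.1.0",
  "org.apache.hbase" % "hbase-common" % "2.1.0",
  "org.apache.hbase" % "hbase-mapreduce" % "2.1.0",
  // Pick an hbase-spark build that matches Hadoop 2.7.x and Spark 2.3.x;
  // 2.1.0-cdh6.1.0 targets Hadoop 3 and will not match this cluster.
  "org.apache.hbase" % "hbase-spark" % "<hadoop2-compatible-version>"
)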
