
Hadoop jar conflict issue when dealing with 2 versions of Hadoop in the same program

Below is the Hadoop incompatibility issue we are currently running into.

USE-CASE

We are reading/scanning from HBase (version 0.96.1.2.0.6.1-101-hadoop2) running on a new Hadoop cluster (version 2.2.0.2.0.6.0-101 [Hortonworks]) and writing to an old Hadoop cluster (version 0.20.2+320 [Cloudera]) from a single Java program. However, we are getting an exception due to the incompatibility between the two Hadoop versions.

The snippet below throws an exception:

private HbaseConfigFactory(String clusterUri, String hbaseRootdir) throws Exception {
    factoryImpl = HBaseConfiguration.create();
    factoryImpl.clear();

    factoryImpl.set("hbase.zookeeper.quorum", clusterUri);
    factoryImpl.set("zookeeper.znode.parent", hbaseRootdir);

    // set the ZooKeeper port
    String[] eles = clusterUri.split(":");
    if (eles.length > 1) {
        factoryImpl.set("hbase.zookeeper.property.clientPort", eles[1]);
    }

    try {
        // THE LINE BELOW CAUSES THE EXCEPTION
        HBaseAdmin.checkHBaseAvailable(factoryImpl);
    } catch (Exception e) {
        String message = String.format("HBase is currently unavailable: %s, %s",
                e.getMessage(), e);
        logger.error(message);

        throw new Exception(e);
    }
}

Below is the exception:

java.lang.Exception: java.lang.IllegalArgumentException: Can't find method getCurrentUser in org.apache.hadoop.security.UserGroupInformation!
    at com.shopping.writetold.HbaseConfigFactory.(HbaseConfigFactory.java:36)
    at com.shopping.writetold.HbaseConfigFactory.getInstance(HbaseConfigFactory.java:48)
    at com.shopping.writetold.WriteToHDFS.readDeals(WriteToHDFS.java:63)
    at com.shopping.writetold.WriteToHDFS.main(WriteToHDFS.java:50)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:120)
Caused by: java.lang.IllegalArgumentException: Can't find method getCurrentUser in org.apache.hadoop.security.UserGroupInformation!
    at org.apache.hadoop.hbase.util.Methods.call(Methods.java:45)
    at org.apache.hadoop.hbase.security.User.call(User.java:414)
    at org.apache.hadoop.hbase.security.User.callStatic(User.java:404)
    at org.apache.hadoop.hbase.security.User.access$200(User.java:48)
    at org.apache.hadoop.hbase.security.User$SecureHadoopUser.(User.java:221)
    at org.apache.hadoop.hbase.security.User$SecureHadoopUser.(User.java:216)
    at org.apache.hadoop.hbase.security.User.getCurrent(User.java:139)
    at org.apache.hadoop.hbase.client.HConnectionKey.(HConnectionKey.java:67)
    at org.apache.hadoop.hbase.client.HConnectionManager.getConnection(HConnectionManager.java:240)
    at org.apache.hadoop.hbase.client.HBaseAdmin.checkHBaseAvailable(HBaseAdmin.java:2321)
    at com.shopping.writetold.HbaseConfigFactory.(HbaseConfigFactory.java:29)
    ... 8 more
Caused by: java.lang.NoSuchMethodException: org.apache.hadoop.security.UserGroupInformation.getCurrentUser()
    at java.lang.Class.getMethod(Class.java:1624)
    at org.apache.hadoop.hbase.util.Methods.call(Methods.java:38)
    ... 18 more

Maven dependency entries:

    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-core</artifactId>
        <version>0.20.2</version>
    </dependency>

    <dependency>
        <groupId>org.apache.hbase</groupId>
        <artifactId>hbase-client</artifactId>
        <version>0.96.0-hadoop2</version>
    </dependency>

Jar details (Maven: org.apache.hadoop:hadoop-common:2.1.0-beta, jar: hadoop-common-2.1.0-beta.jar)

Method signature in class file UserGroupInformation: public static synchronized org.apache.hadoop.security.UserGroupInformation getCurrentUser() throws java.io.IOException

Jar details (Maven: org.apache.hadoop:hadoop-core:0.20.2, jar: hadoop-core-0.20.2.jar)

Method signature in class file UserGroupInformation: static javax.security.auth.Subject getCurrentUser()

Both classes live in the same namespace: package org.apache.hadoop.security;

When I run separate programs to read from HBase and to write to the Cloudera HDFS, each with only its respective jars, they work fine.

Is there any solution for handling the above incompatibility in a single program?

Thanks, Sagar B

DISCLAIMER: As a prerequisite I take it that updating to the latest uniform Hadoop library is out of the question, but I hardly know anything about Hadoop.

Essentially, you are in a conflict because you need both libraries on the classpath at the same time at runtime, which is a hard task. In order to have two identically named classes from different sources in the same VM, you will need at least two different classloaders.

From a technical/architectural point of view, the thing to do in this scenario is to decouple the two parts of the application: either run them with different classloaders in the same VM, or actually split them into heterogeneous programs that exchange messages over a shared mechanism (JMS comes to mind, but there are plenty of alternatives).

Since you want to explore the single-VM route, you are faced with two options: doing it manually, or using an application container that supports this (OSGi). In either case you will need to at least decouple the applications in Maven to differentiate their dependencies.

Manually would mean keeping one part of the app in the current classloader and loading the second part through a custom classloader. So, presuming the write part is offloaded into a separate jar, create a custom classloader that loads the old Hadoop jar (and its transitive dependencies, where applicable) plus this separate jar file. Rather technical, but doable.
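As a rough sketch of that manual approach, a parent-last ("child-first") classloader can be built on top of URLClassLoader, so that classes found in the old Hadoop jar win over the identically named classes already on the application classpath. This is only an illustration of the general technique, not code from the original post:

```java
import java.net.URL;
import java.net.URLClassLoader;

/**
 * A child-first (parent-last) classloader: it tries its own jars before
 * delegating to the parent, so a class such as
 * org.apache.hadoop.security.UserGroupInformation can resolve to the old
 * hadoop-core jar even though the new hadoop-common is on the parent
 * classpath. Core java.* classes are always delegated to the parent.
 */
public class ChildFirstClassLoader extends URLClassLoader {

    public ChildFirstClassLoader(URL[] urls, ClassLoader parent) {
        super(urls, parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve)
            throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            // Already defined by this loader?
            Class<?> c = findLoadedClass(name);
            if (c == null && !name.startsWith("java.")) {
                try {
                    // Try our own jars first (child-first behaviour).
                    c = findClass(name);
                } catch (ClassNotFoundException e) {
                    // Not in our jars; fall through to the parent.
                }
            }
            if (c == null) {
                // Standard parent delegation as a fallback.
                c = super.loadClass(name, false);
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }
}
```

The write half would then be loaded through `new ChildFirstClassLoader(new URL[]{ oldHadoopCoreJar, writerJar }, appLoader)` and invoked reflectively (jar URLs and the entry-point class are yours to define), so that its org.apache.hadoop.* classes resolve to the 0.20.2 jar while the HBase-reading half keeps using the Hadoop 2.x classes.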

I found a reference question using java.util.ServiceLoader that may shed some light on the topic; use at your own peril. (Dynamically loading plugin jars using ServiceLoader)
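The ServiceLoader idea might look roughly like the sketch below; the HdfsWriter interface and the jar layout are hypothetical names of mine, not from the linked question. One caveat worth stating in code: a plain URLClassLoader is parent-first, so the conflicting org.apache.hadoop.* classes would additionally need a parent-last loader to actually resolve to the old jars.

```java
import java.io.File;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ServiceLoader;

public class WriterPluginLoader {

    /** Hypothetical plugin contract that both halves of the app compile against. */
    public interface HdfsWriter {
        void write(String path, byte[] data) throws Exception;
    }

    /**
     * Loads HdfsWriter implementations advertised via
     * META-INF/services/ entries inside the given jars, using a dedicated
     * classloader so the old Hadoop classes stay off the main classpath.
     * Caveat: URLClassLoader delegates parent-first, so classes that also
     * exist on the application classpath would still need a parent-last
     * loader to resolve to the old jars.
     */
    public static ServiceLoader<HdfsWriter> load(File... jars) throws Exception {
        URL[] urls = new URL[jars.length];
        for (int i = 0; i < jars.length; i++) {
            urls[i] = jars[i].toURI().toURL();
        }
        // Parent is the current loader so plugins can see the HdfsWriter interface.
        ClassLoader pluginLoader =
                new URLClassLoader(urls, WriterPluginLoader.class.getClassLoader());
        return ServiceLoader.load(HdfsWriter.class, pluginLoader);
    }
}
```

Each implementation jar would ship a `META-INF/services/` provider file naming its HdfsWriter class; the main program then iterates the returned ServiceLoader and calls `write` without ever referencing the old Hadoop types directly.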

Another decoupling solution that works for exactly this reason is the OSGi model, which allows jars to have their own separate runtime dependency trees in a peer hierarchy. This essentially means that the same class may exist in multiple versions in the VM, since every bundle gets its own classloader. However, OSGi is another beast for many other reasons, and requires a somewhat steep learning effort to really understand and utilize.
