
oozie-hive beeline not working with kerberos

We recently migrated from our old HDP cluster (without Kerberos) to a new HDP cluster (with Kerberos). We are facing authentication issues while running our Oozie jobs on the new cluster. Please refer to the workflow.xml below. The first action, 'hive-101', works fine; however, the second action, 'hive-102', fails.

<credentials>
    <credential name="hs2-creds" type="hive2">
        <property>
            <name>hive2.server.principal</name>
            <value>${jdbcPrincipal}</value>
        </property>
        <property>
            <name>hive2.jdbc.url</name>
            <value>${jdbcURL}</value>
        </property>
    </credential>
</credentials>

<start to="hive-101"/>

<action name="hive-101" cred="hs2-creds">
    <hive2 xmlns="uri:oozie:hive2-action:0.2">
        <jdbc-url>${jdbcURL}</jdbc-url>
        <password>${hivepassword}</password>
        <query>SELECT count(*) FROM table1;</query>
    </hive2>
    <ok to="hive-102"/>
    <error to="fail"/>
</action>


<action name="hive-102" retry-max="${maxretry}" retry-interval="${retryinterval}">
    <shell xmlns="uri:oozie:shell-action:0.3">
        <exec>beeline</exec>
        <argument>jdbc:hive2://zk01.abc.com:2181,zk02.abc.com:2181,zk03.abc.com:2181/${hivedatabase};principal=hive/_HOST@ABC.COM;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2</argument>
        <argument>--outputformat=vertical</argument>
        <argument>--silent=true</argument>
        <argument>-e</argument>
        <argument>
            SELECT max(id) as mx_id FROM ${hivedatabase}.table1;
        </argument>
        <capture-output/>
    </shell>
    <ok to="end"/>
    <error to="fail"/>
</action>

Below are the error details:

ERROR transport.TSaslTransport: SASL negotiation failure
javax.security.sasl.SaslException: GSS initiate failed
at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211) ~[?:1.8.0_212]

Caused by: org.ietf.jgss.GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147) ~[?:1.8.0_212]
at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:122) ~[?:1.8.0_212]

WARN jdbc.HiveConnection: Failed to connect to nn02.abc.com:10000
WARN jdbc.HiveConnection: Could not open client transport with JDBC Uri: jdbc:hive2://nn02.abc.com:10000/db_test;principal=hive/_HOST@ABC.COM;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2: GSS initiate failed Retrying 0 of 1
ERROR transport.TSaslTransport: SASL negotiation failure
javax.security.sasl.SaslException: GSS initiate failed
at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211) ~[?:1.8.0_212]

Caused by: org.ietf.jgss.GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147) ~[?:1.8.0_212]

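The shell action runs on an arbitrary data node as the Unix user who started the Oozie workflow. That user is not automatically authenticated with Kerberos on that node, so beeline finds no TGT there, which matches the "Failed to find any Kerberos tgt" error. The hive2 action presumably works because it uses the hs2-creds credential, whereas the shell action declares no credential.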
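To confirm which user and host the shell action actually runs on, and whether a ticket is available there, a small diagnostic script can be run from the same kind of shell action. This is only a sketch: the script name and the `describe_identity` function are made up for illustration, and it assumes `klist` from the Kerberos client tools is on the node's PATH.

```shell
#!/usr/bin/env bash
# whoami_kerberos.sh: hypothetical diagnostic, not part of the original workflow.
set -u

# Print the effective user, the node name, and whether a valid TGT exists.
describe_identity() {
  echo "user=$(id -un)"
  echo "host=$(uname -n)"
  if ! command -v klist >/dev/null 2>&1; then
    echo "tgt=unknown (klist not installed)"
  elif klist -s 2>/dev/null; then
    # klist -s is silent and returns 0 only if valid credentials are cached
    echo "tgt=present"
  else
    echo "tgt=missing"
  fi
}

describe_identity
```

The output is written as key=value lines so that it can be read back through `<capture-output/>`. If it reports `tgt=missing`, the beeline call in the same container would be expected to fail in the way shown above.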

I believe you will have to place a Kerberos keytab for the user on each data node. Your Oozie shell action will then need to run a script that first runs kinit with the keytab and then runs the beeline command.
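A minimal sketch of such a wrapper script follows. The keytab file name, the user principal `etl_user@ABC.COM`, the script name, and the `DRY_RUN` switch are assumptions made for illustration; the JDBC URL and query are taken from the workflow in the question. The shell action would call this script (passing the database name) instead of calling beeline directly.

```shell
#!/usr/bin/env bash
# kinit_beeline.sh: hypothetical wrapper for the Oozie shell action.
# Assumed, not from the original post: keytab name, user principal, DRY_RUN.
set -euo pipefail

KEYTAB="${KEYTAB:-etl_user.keytab}"        # assumed: readable on every node
PRINCIPAL="${PRINCIPAL:-etl_user@ABC.COM}" # assumed: principal of the workflow user
ZK_QUORUM="zk01.abc.com:2181,zk02.abc.com:2181,zk03.abc.com:2181"

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Build the ZooKeeper-discovery JDBC URL used in the workflow for a database.
jdbc_url() {
  printf 'jdbc:hive2://%s/%s;principal=hive/_HOST@ABC.COM;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2' \
    "$ZK_QUORUM" "$1"
}

main() {
  local db="$1"
  # Obtain a TGT from the keytab so that beeline can complete the GSSAPI handshake.
  run kinit -kt "$KEYTAB" "$PRINCIPAL"
  run beeline -u "$(jdbc_url "$db")" --outputformat=vertical --silent=true \
    -e "SELECT max(id) AS mx_id FROM ${db}.table1;"
}

# Only act when a database name is passed as the first argument.
if [ "$#" -gt 0 ]; then
  main "$@"
fi
```

Running `DRY_RUN=1 ./kinit_beeline.sh db_test` prints the two commands without contacting the KDC or HiveServer2, which may help when checking the arguments before deploying the keytab.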

From Apache Oozie by Mohammad Kamrul Islam and Aravind Srinivasan:

On a nonsecure Hadoop cluster, the shell command will execute as the Unix user who runs the TaskTracker (Hadoop 1) or the YARN container (Hadoop 2). This is typically a system-defined user. On secure Hadoop clusters running Kerberos, the shell commands will run as the Unix user who submitted the workflow containing the action.
