
Hive queries failing on Tez but succeeding on Map-Reduce when connecting from Beeline

I am running into a weird error when running a simple select * query with a where clause. The query execution status is summarized below:

  1. Connecting to Hive from EMR (Tez engine) - succeeding
  2. Connecting to Hive from EMR (MR engine) - succeeding
  3. Connecting to Hive from Beeline (Tez engine) - failing
  4. Connecting to Hive from Beeline (MR engine) - succeeding

I need to solve case 3. Below is the error trace I am getting. I cannot work out the root cause of the failure or what the log is trying to convey.

    at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:380)
    at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:257)
    at org.apache.hive.service.cli.operation.SQLOperation.access$800(SQLOperation.java:91)
    at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:348)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1840)
    at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:362)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)' SQL<select `ID`, `ISDELETED`, `ACCOUNTID`, `CREATEDBYID`, `CREATEDDATE`, `FIELD`, `OLDVALUE`, `NEWVALUE`, `AUDIT_UPD_TS`, `SRC_OP_TYP`, `GG_INGEST_TS` from `t4i_ent_sfdc_b2b_psa`.`sf_accounthistory` x WHERE SRC_OP_TYP='NA'>

I was able to solve this. The problem was that my application was connecting to Hive via JDBC without specifying a user. Queries that only needed to stream data back succeeded, but queries that triggered a job which writes to HDFS failed on the write operation with this error:

Failed to execute tez graph.
    org.apache.hadoop.security.AccessControlException: Permission denied: user=anonymous, access=WRITE, inode="/user":hdfs:hadoop:drwxr-xr-x
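The message reads as follows: the connection ran as user anonymous, it asked for WRITE access, and the inode /user is owned by hdfs, has group hadoop, and has mode drwxr-xr-x. A minimal sketch of the basic owner/group/other check is shown below. The function name can_write and its argument order are made up for illustration, and the sketch deliberately ignores the HDFS superuser and ACLs.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop): decide whether a user may write
# to an inode described as owner:group:mode, as printed in the exception.
# Simplification: no superuser handling, no ACLs.
can_write() {
  user=$1; user_groups=$2; owner=$3; group=$4; mode=$5
  if [ "$user" = "$owner" ]; then
    bit=$(printf '%s' "$mode" | cut -c3)   # owner write bit
  elif printf '%s\n' $user_groups | grep -qFx -- "$group"; then
    bit=$(printf '%s' "$mode" | cut -c6)   # group write bit
  else
    bit=$(printf '%s' "$mode" | cut -c9)   # "other" write bit
  fi
  [ "$bit" = "w" ]
}

# Values taken from the error message; anonymous is assumed to have no groups.
if can_write anonymous "" hdfs hadoop drwxr-xr-x; then
  echo "write allowed"
else
  echo "write denied"
fi
```

With these values only the owner hdfs has the write bit on /user, so a request made as anonymous is rejected. Members of the hadoop group cannot write to /user itself either, so a named user presumably works because it already has its own directory under /user, although the post does not confirm this.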

To resolve this, I added user=hadoop; to the JDBC URL, and the queries run fine now.
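As a rough sketch of that fix, the snippet below builds a HiveServer2 JDBC URL with the user in the session-variable part after the database name. The host name, port 10000, and database default are placeholder assumptions, and hive_jdbc_url is a helper name made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: assemble a HiveServer2 JDBC URL that names the user.
# Arguments: host, port, database, user.
hive_jdbc_url() {
  printf 'jdbc:hive2://%s:%s/%s;user=%s\n' "$1" "$2" "$3" "$4"
}

# Placeholder values (assumptions); substitute your own HiveServer2 endpoint.
JDBC_URL=$(hive_jdbc_url "hs2.example.internal" 10000 "default" "hadoop")
echo "$JDBC_URL"

# From Beeline the same URL can be used directly, or the user can be
# passed with -n instead of being embedded in the URL:
#   beeline -u "$JDBC_URL"
#   beeline -u "jdbc:hive2://hs2.example.internal:10000/default" -n hadoop
```

Either form should make the session run as hadoop rather than anonymous. Which one is more convenient depends on whether the client is an application using the JDBC driver or an interactive Beeline session.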

Try invoking Beeline (Tez engine) as shown below, then run your query:

beeline -u "jdbc:hive2://<host>:<port>,/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2-batch?tez.queue.name=<yarn-queue-name>"

If that doesn't work, check the SQL itself for issues. I see an 'x' before the WHERE clause in your query, which may be the problem. Try removing it and rerunning the query.

`sf_accounthistory` x WHERE SRC_OP_TYP='NA'

Hope this is helpful
