
Hive table count shows 0 via the Java JDBC driver

I get 0 records when I query a Hive table from Java via JDBC, but the same query run from Beeline works fine and returns a non-zero count. What could be the reason?

The behaviour you are seeing may be due to stale statistics in the Hive Metastore for the tables that the query refers to.

To test this, run set hive.compute.query.using.stats; in both the Beeline session and the JDBC client session, and check whether the property is set to TRUE or FALSE in each.

If TRUE, the query fetches the count from the statistics stored in the Hive Metastore instead of executing a MapReduce job. This is usually faster, but it may return an incorrect or stale count if the statistics for the table have not been updated in the Hive Metastore.

If FALSE, the query runs a MapReduce job as part of its execution and counts the records present in the data files in HDFS. This takes longer than the previous option but returns accurate results.
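On the JDBC side, the same check can be sketched in Java. This is a minimal sketch, not a definitive implementation: the class and method names (HiveStatsSetting, parseSetLine, readSetting) are hypothetical, and it assumes that the server answers SET <property> with a single-column row of the form key=value, which is what Beeline typically prints. The caller is expected to pass an already open java.sql.Connection to HiveServer2.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper, not part of Hive: reads the session value of
// hive.compute.query.using.stats through a plain JDBC connection.
final class HiveStatsSetting {
    static final String PROPERTY = "hive.compute.query.using.stats";

    // Parses a line such as "hive.compute.query.using.stats=true".
    // Returns null when the line is not in the assumed key=value form
    // (for example when the server reports the property as undefined).
    static Boolean parseSetLine(String line) {
        if (line == null) {
            return null;
        }
        int eq = line.indexOf('=');
        if (eq < 0) {
            return null;
        }
        String key = line.substring(0, eq).trim();
        String value = line.substring(eq + 1).trim();
        if (!PROPERTY.equals(key)) {
            return null;
        }
        return Boolean.parseBoolean(value); // case-insensitive "true"
    }

    // Runs "SET <property>" in the given session and returns the parsed value,
    // or null if no row matched the assumed format.
    // Note: no trailing semicolon; that is a Beeline/CLI terminator.
    static Boolean readSetting(Connection conn) throws SQLException {
        try (Statement stmt = conn.createStatement()) {
            if (!stmt.execute("SET " + PROPERTY)) {
                return null; // the statement produced no result set
            }
            try (ResultSet rs = stmt.getResultSet()) {
                while (rs.next()) {
                    Boolean parsed = parseSetLine(rs.getString(1));
                    if (parsed != null) {
                        return parsed;
                    }
                }
            }
        }
        return null;
    }
}
```

If readSetting returns TRUE for the JDBC session while Beeline reports FALSE, the two clients are answering count(*) from different sources, which would explain the mismatch.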

Solution:

  1. Set the property hive.compute.query.using.stats to false by running the statement below in both the Beeline and JDBC client sessions. Hive will then compute the count from the data present in HDFS by running a MapReduce job.
set hive.compute.query.using.stats=false;

OR

  2. Compute statistics for the table manually by running the statement below in either the Beeline or the JDBC client session. This refreshes the statistics stored in the Hive Metastore, after which count(*) should return correct results for that table in any Hive session.
ANALYZE TABLE <database_name>.<table_name> COMPUTE STATISTICS;
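From a Java client, both options can be applied over the same JDBC connection that runs the count. The following is a minimal sketch under stated assumptions: HiveExactCount and its methods are hypothetical names, the identifier check is an illustrative safeguard for building the SQL text, and the caller supplies an open java.sql.Connection to HiveServer2. The SET statement only affects the session it runs in, so it is issued on the same connection as the query.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hive: applies either fix over JDBC.
final class HiveExactCount {
    // Illustrative restriction: plain identifiers only, so that names can be
    // concatenated into SQL text safely.
    private static final Pattern IDENT = Pattern.compile("[A-Za-z0-9_]+");

    static String qualifiedName(String database, String table) {
        if (database == null || table == null
                || !IDENT.matcher(database).matches()
                || !IDENT.matcher(table).matches()) {
            throw new IllegalArgumentException("unexpected database or table name");
        }
        return database + "." + table;
    }

    static String countSql(String database, String table) {
        return "SELECT COUNT(*) FROM " + qualifiedName(database, table);
    }

    static String analyzeSql(String database, String table) {
        return "ANALYZE TABLE " + qualifiedName(database, table) + " COMPUTE STATISTICS";
    }

    // Option 1: disable stats-based answers for this session, then count.
    // Statements are sent without a trailing semicolon.
    static long countFromData(Connection conn, String database, String table)
            throws SQLException {
        String sql = countSql(database, table);
        try (Statement stmt = conn.createStatement()) {
            stmt.execute("SET hive.compute.query.using.stats=false");
            try (ResultSet rs = stmt.executeQuery(sql)) {
                if (!rs.next()) {
                    throw new SQLException("COUNT(*) returned no rows");
                }
                return rs.getLong(1);
            }
        }
    }

    // Option 2: refresh the table statistics in the Hive Metastore.
    static void refreshStats(Connection conn, String database, String table)
            throws SQLException {
        try (Statement stmt = conn.createStatement()) {
            stmt.execute(analyzeSql(database, table));
        }
    }
}
```

A connection would typically be obtained with DriverManager.getConnection and a jdbc:hive2:// URL, with the Hive JDBC driver on the classpath; the host, port, and credentials depend on the cluster.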

Hope this helps!

