
How to calculate the percentage of records when comparing Hive tables?

I have two Hive tables, table1 and table2, and I already have the row count of each. I created a third table, abc, containing the non-matching records from table1 and table2. How can I get the number of records in abc as a percentage of the combined row count of table1 and table2?

1. select count(*) from table1 A

2. select count(*) from table2 B

3. create table dbo.abc as
   select A.column1, A.columnb from table1 A
   inner join table2 B
   where A.column3 <> B.column3

4. How do I get the percentage of records in table abc?
   For example:

                     count(*) from abc
       ------------------------------------------- * 100
       count(*) from table1 + count(*) from table2

Expected output is:

Example: 
  number_of_non_matching_records = 20%

Are you trying to do this in one statement?

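-- Every row of table1 is paired with every row of table2, so the
-- denominator here is the number of row pairs, not count(A) + count(B).
select count(*) as combos_in_ab,
       sum(case when a.column3 <> b.column3 then 1 else 0 end) as combos_in_3,
       -- averaging 100.0 / 0 yields a percentage (0 to 100) rather than a fraction
       avg(case when a.column3 <> b.column3 then 100.0 else 0 end) as percent_in_3
from table1 a cross join
     table2 b;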
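
If the goal is literally the ratio written in the question, count of abc over the combined count of table1 and table2, a minimal sketch could look like the one below. It assumes dbo.abc has already been created as in step 3; the aliases abc_rows, t1_rows, t2_rows and the output column pct_non_matching are names made up for illustration. Each subquery returns a single row, so the cross joins produce exactly one row; depending on configuration, Hive in strict mode may reject cross joins.

```sql
-- Sketch only: percentage of rows in abc relative to count(table1) + count(table2).
-- abc_rows, t1_rows, t2_rows and pct_non_matching are illustrative names.
select round(100.0 * c.abc_rows / (a.t1_rows + b.t2_rows), 2) as pct_non_matching
from       (select count(*) as abc_rows from dbo.abc) c
cross join (select count(*) as t1_rows  from table1)  a
cross join (select count(*) as t2_rows  from table2)  b;
```

To get output in the form shown in the question (for example 20%), the numeric result can be wrapped as concat(cast(... as string), '%').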
