How to calculate the percentage of records when comparing Hive tables?
I have two Hive tables called table1 and table2, and I have the row count of each. I created a third table called abc that holds the non-matching records from table1 and table2. How can I get the number of records in table abc as a percentage of the combined count of table1 and table2?
1. select count(*) from table1 A
2. select count(*) from table2 B
3. create table dbo.abc as
select A.column1, A.columnb from table1 A
inner join table2 B
where A.column3 <> B.column3
4. How do I get the percentage of records in table abc?
For example:
          count(*) from abc
--------------------------------------------- * 100
count(*) from table1 + count(*) from table2
Expected output is:
Example:
number_of_non_matching_records = 20%
Are you trying to do this in one statement?
select count(*) as combos_in_ab,
       sum(case when a.column3 <> b.column3 then 1 else 0 end) as combos_in_3,
       -- avg() returns a fraction between 0 and 1; multiply by 100 for a percentage
       avg(case when a.column3 <> b.column3 then 1.0 else 0 end) as percent_in_3
from table1 a cross join
     table2 b;
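Note that the query above divides by the number of row combinations produced by the cross join, not by the sum of the two table counts. If you want the ratio exactly as written in the question (rows in abc over the rows in table1 plus the rows in table2), something along these lines might work. It is only a sketch: it assumes abc already exists as created in step 3, and the aliases and the output name pct_non_matching are made up for illustration.

```sql
-- Sketch only: assumes abc was already built (step 3 of the question).
-- Qualify the name (e.g. dbo.abc) if that is where the table was created.
-- Each subquery returns exactly one row, so the cross joins yield one row.
select round(100.0 * x.abc_cnt / (y.t1_cnt + z.t2_cnt), 2) as pct_non_matching
from (select count(*) as abc_cnt from abc) x
cross join (select count(*) as t1_cnt from table1) y
cross join (select count(*) as t2_cnt from table2) z;
```

As a purely hypothetical illustration, if abc held 20 rows, table1 held 60 and table2 held 40, this would evaluate to 100.0 * 20 / (60 + 40) = 20, matching the 20% in the question's example. If both source tables are empty, the division should come back as NULL in Hive rather than raise an error.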