Comparing two tables that don't have a unique key
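I need to compare the data in two tables and check which attributes do not match. The tables have the same definition, but the problem is that I have no unique key to compare on. I tried using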
CONCAT(table1.A, table1.B)
= CONCAT(table2.A, table2.B)
but I still get duplicate rows. I also tried using NVL on a few columns, but that did not help:
SELECT
UT.cat,
PD.cat
FROM
EM UT, EM_63 PD
WHERE
NVL(UT.cat, 1) = NVL(PD.cat, 1) AND
NVL(UT.AT_NUMBER, 1) = NVL(PD.AT_NUMBER, 1) AND
NVL(UT.OFFSET, 1) = NVL(PD.OFFSET, 1) AND
NVL(UT.PROD, 1) = NVL(PD.PROD, 1)
;
One table has 34k records and the other has 35k, but the query above returns 3 million rows.
Columns in the tables:
COUNTRY
CATEGORY
TYPE
DESCRIPTION
Sample data:
Table 1:
COUNTRY CATEGORY TYPE DESCRIPTION
US C T1 In
IN A T2 OUT
B C T2 IN
Y C T1 INOUT
Table 2:
COUNTRY CATEGORY TYPE DESCRIPTION
US C T2 In
IN B T2 Out
Q C T2 IN
Expected output:
column Matched unmatched
COUNTRY 2 1
CATEGORY 2 1
TYPE 2 1
DESCRIPTION 3 0
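In the most general case (when you may have duplicate rows, and you want to see which rows exist in one table but not in the other, and also which rows exist in both tables but, say, 3 times in the first table and 5 times in the other):
This is a very common problem with a settled "best solution" which, for some reason, still seems to be unknown to most people, even though it was developed on AskTom many years ago and has been presented countless times.
You do not need a join, you do not need a unique key of any kind, and you do not need to read either table more than once. The idea is to add two columns that show which table each row comes from, do a UNION ALL, then GROUP BY all columns except the "source" columns and show the count for each table. Something like this: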
select count(t_1) as count_table_1, count(t_2) as count_table_2, col1, col2, ...
from (
select 'x' as t_1, null as t_2, col1, col2, ...
from table_1
union all
select null as t_1, 'x' as t_2, col1, col2, ...
from table_2
)
group by col1, col2, ...
having count(t_1) != count(t_2)
;
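To make the UNION ALL / GROUP BY approach concrete, here is a minimal sketch that runs the same query shape against the sample rows from the question. It assumes SQLite (through Python's standard sqlite3 module) as a stand-in for Oracle, and the names table_1, table_2 and diff_tables are illustrative, not from the original posts.

```python
import sqlite3

# Sample rows from the question (COUNTRY, CATEGORY, TYPE, DESCRIPTION).
TABLE_1 = [("US", "C", "T1", "In"), ("IN", "A", "T2", "OUT"),
           ("B", "C", "T2", "IN"), ("Y", "C", "T1", "INOUT")]
TABLE_2 = [("US", "C", "T2", "In"), ("IN", "B", "T2", "Out"),
           ("Q", "C", "T2", "IN")]

# Same shape as the query above: tag each row with its source,
# stack both tables, group on every data column, keep unequal counts.
DIFF_SQL = """
select count(t_1) as count_table_1, count(t_2) as count_table_2,
       country, category, type, description
from (
    select 'x' as t_1, null as t_2, country, category, type, description
    from table_1
    union all
    select null as t_1, 'x' as t_2, country, category, type, description
    from table_2
)
group by country, category, type, description
having count(t_1) != count(t_2)
order by country, category, type, description
"""


def diff_tables(rows_1, rows_2):
    """Return (count_in_1, count_in_2, *columns) for rows whose counts differ."""
    conn = sqlite3.connect(":memory:")  # assumption: SQLite, not Oracle
    for name, rows in (("table_1", rows_1), ("table_2", rows_2)):
        conn.execute(
            f"create table {name} "
            "(country text, category text, type text, description text)"
        )
        conn.executemany(f"insert into {name} values (?, ?, ?, ?)", rows)
    result = conn.execute(DIFF_SQL).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in diff_tables(TABLE_1, TABLE_2):
        print(row)
```

Each returned row carries both counts, so a row that appears 3 times in one table and 5 times in the other shows up as (3, 5, ...), while rows with equal counts are filtered out by the HAVING clause.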
Start with this query to check whether these 4 columns form a key.
select occ_total,occ_ut,occ_pd
      ,count(*) as records
from  (select count (*) as occ_total
             ,count (case tab when 'UT' then 1 end) as occ_ut
             ,count (case tab when 'PD' then 1 end) as occ_pd
       from  (          select 'UT' as tab,cat,AT_NUMBER,OFFSET,PROD from EM
              union all select 'PD'       ,cat,AT_NUMBER,OFFSET,PROD from EM_63
             ) t
       group by cat,AT_NUMBER,OFFSET,PROD
      ) t
group by occ_total,occ_ut,occ_pd
order by records desc
;
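A small runnable sketch of this key check follows. It again assumes SQLite via Python's sqlite3 rather than Oracle, and for brevity it uses only two candidate key columns (cat, prod); the function name key_profile and the test data are hypothetical.

```python
import sqlite3

# For each candidate-key value, count its occurrences overall and per table,
# then count how many key values share each (total, ut, pd) pattern.
KEY_CHECK_SQL = """
select occ_total, occ_ut, occ_pd, count(*) as records
from (
    select count(*) as occ_total,
           count(case tab when 'UT' then 1 end) as occ_ut,
           count(case tab when 'PD' then 1 end) as occ_pd
    from (
        select 'UT' as tab, cat, prod from em
        union all
        select 'PD', cat, prod from em_63
    ) t
    group by cat, prod
) t
group by occ_total, occ_ut, occ_pd
order by records desc, occ_total
"""


def key_profile(em_rows, em_63_rows):
    """Return (occ_total, occ_ut, occ_pd, records) rows for the candidate key."""
    conn = sqlite3.connect(":memory:")  # assumption: SQLite, not Oracle
    for name, rows in (("em", em_rows), ("em_63", em_63_rows)):
        conn.execute(f"create table {name} (cat integer, prod text)")
        conn.executemany(f"insert into {name} values (?, ?)", rows)
    result = conn.execute(KEY_CHECK_SQL).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    em = [(1, "a"), (2, "b"), (3, "c"), (3, "c")]      # (3, 'c') is duplicated
    em_63 = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]   # (4, 'd') only here
    for row in key_profile(em, em_63):
        print(row)
```

If the chosen columns really form a key in both tables, occ_ut and occ_pd never exceed 1 in the output; any pattern with a larger value means the candidate key has duplicates in that table.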
Once you have chosen a "key", you can use the following query to look at the values of the attributes:
select count (*) as occ_total
      ,count (case tab when 'UT' then 1 end) as occ_ut
      ,count (case tab when 'PD' then 1 end) as occ_pd
      ,count (distinct att1) as cnt_dst_att1
      ,count (distinct att2) as cnt_dst_att2
      ,count (distinct att3) as cnt_dst_att3
      ,...
      ,listagg (case tab when 'UT' then att1 end) within group (order by att1) as att1_vals_ut
      ,listagg (case tab when 'PD' then att1 end) within group (order by att1) as att1_vals_pd
      ,listagg (case tab when 'UT' then att2 end) within group (order by att2) as att2_vals_ut
      ,listagg (case tab when 'PD' then att2 end) within group (order by att2) as att2_vals_pd
      ,listagg (case tab when 'UT' then att3 end) within group (order by att3) as att3_vals_ut
      ,listagg (case tab when 'PD' then att3 end) within group (order by att3) as att3_vals_pd
      ,...
from  (          select 'UT' as tab,cat,AT_NUMBER,OFFSET,PROD,att1,att2,att3,... from EM
       union all select 'PD'       ,cat,AT_NUMBER,OFFSET,PROD,att1,att2,att3,... from EM_63
      ) t
group by cat,AT_NUMBER,OFFSET,PROD
;
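The attribute comparison can be sketched in the same way. This version assumes SQLite via Python's sqlite3, so it swaps Oracle's listagg for SQLite's group_concat; it uses a two-column key (cat, prod) and a single attribute att1, and the name attribute_report is hypothetical.

```python
import sqlite3

# Per key value: occurrence counts, number of distinct att1 values,
# and the att1 values seen on each side.
ATTR_SQL = """
select cat, prod,
       count(*) as occ_total,
       count(case tab when 'UT' then 1 end) as occ_ut,
       count(case tab when 'PD' then 1 end) as occ_pd,
       count(distinct att1) as cnt_dst_att1,
       group_concat(case tab when 'UT' then att1 end) as att1_vals_ut,
       group_concat(case tab when 'PD' then att1 end) as att1_vals_pd
from (
    select 'UT' as tab, cat, prod, att1 from em
    union all
    select 'PD', cat, prod, att1 from em_63
) t
group by cat, prod
order by cat, prod
"""


def attribute_report(em_rows, em_63_rows):
    """Return one row per key value with counts and att1 values per table."""
    conn = sqlite3.connect(":memory:")  # assumption: SQLite, not Oracle
    for name, rows in (("em", em_rows), ("em_63", em_63_rows)):
        conn.execute(f"create table {name} (cat integer, prod text, att1 text)")
        conn.executemany(f"insert into {name} values (?, ?, ?)", rows)
    result = conn.execute(ATTR_SQL).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    em = [(1, "a", "x"), (2, "b", "y")]
    em_63 = [(1, "a", "x"), (2, "b", "z")]
    for row in attribute_report(em, em_63):
        print(row)
```

Key values where cnt_dst_att1 is greater than 1 are the ones whose att1 differs. Note that count(distinct ...) ignores NULLs, so a NULL on one side against a value on the other still counts as 1 and has to be spotted from the per-table value columns.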
The problem with CONCAT is that you can get false matches if your data looks like this:
table1.A = '123'
table1.B = '456'
Concatenated: '123456'
table2.A = '12'
table2.B = '3456'
Also concatenated: '123456'
You have to compare the fields separately: table1.A = table2.A AND table1.B = table2.B
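The collision is easy to reproduce. The sketch below assumes SQLite via Python's sqlite3 and uses the || operator in place of Oracle's CONCAT; the helper name compare is illustrative.

```python
import sqlite3


def compare(a1, b1, a2, b2):
    """Return (concat_match, fieldwise_match) for two (A, B) pairs."""
    conn = sqlite3.connect(":memory:")  # assumption: SQLite, not Oracle
    concat_match, fieldwise_match = conn.execute(
        # First expression compares concatenated values,
        # second compares A with A and B with B.
        "select (? || ?) = (? || ?), (? = ? and ? = ?)",
        (a1, b1, a2, b2, a1, a2, b1, b2),
    ).fetchone()
    conn.close()
    return bool(concat_match), bool(fieldwise_match)


if __name__ == "__main__":
    # Values from the example above: different rows, same concatenation.
    print(compare("123", "456", "12", "3456"))
```

For the values in the example, the concatenated comparison reports a match while the field-by-field comparison does not.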