Comparing two tables that have no unique key

Question

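I need to compare the data in two tables and check which attributes do not match. The tables have the same definition, but the problem is that I have no unique key to compare on. I tried using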

CONCAT(table1.A, table1.B)
= CONCAT(table2.A, table2.B)

but I still get duplicate rows. I also tried using NVL on a few columns, but that did not help:

SELECT  
    UT.cat,
    PD.cat
FROM 
    EM UT, EM_63 PD 
WHERE 
    NVL(UT.cat, 1) = NVL(PD.cat, 1) AND
    NVL(UT.AT_NUMBER, 1) = NVL(PD.AT_NUMBER, 1) AND
    NVL(UT.OFFSET, 1) = NVL(PD.OFFSET, 1) AND  
    NVL(UT.PROD, 1) = NVL(PD.PROD, 1)
;

One table has 34k records and the other has 35k, but the query above returns 3 million rows.

Columns in the tables:

COUNTRY       
CATEGORY   
TYPE    
DESCRIPTION

Sample data:

Table 1:

COUNTRY  CATEGORY TYPE   DESCRIPTION       
US          C       T1      In
IN          A       T2      OUT
B           C       T2      IN
Y           C       T1      INOUT

Table 2:

COUNTRY  CATEGORY TYPE   DESCRIPTION    
US          C       T2      In
IN          B        T2     Out
Q           C       T2      IN

Expected output:

column      Matched  unmatched
COUNTRY         2       1
CATEGORY        2       1
TYPE            2       1
DESCRIPTION     3       0

Answer 1

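This covers the most general case: you may have duplicate rows, you want to see which rows exist in one table but not in the other, and you also want to see which rows exist in both tables but with different counts (say, three times in the first table and five times in the other).

This is a very common problem with a settled "best solution" which, for some reason, still seems to be unknown to most people, even though it was developed on AskTom many years ago and has been presented countless times.

You do not need a join, you do not need a unique key of any kind, and you do not need to read either table more than once. The idea is to add two columns that show which table each row comes from, do a UNION ALL, then GROUP BY all columns except the "source" columns and show the count for each table. Something like this: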

select   count(t_1) as count_table_1, count(t_2) as count_table_2, col1, col2, ...
from     (
           select 'x' as t_1, null as t_2, col1, col2, ... 
             from table_1
           union all
           select null as t_1, 'x' as t_2, col1, col2, ...
             from table_2
         )
group by col1, col2, ...
having   count(t_1) != count(t_2)
;
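To make the pattern above concrete, here is a minimal runnable sketch. It is an illustration under assumptions, not part of the original answer: it uses Python's built-in sqlite3 module instead of Oracle, and the sample rows are made up (loosely based on the column names in the question) so that one row is duplicated in table_1 only and one row exists in table_2 only.

```python
import sqlite3

# In-memory SQLite database standing in for the two Oracle tables (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_1 (country TEXT, category TEXT, type TEXT, description TEXT);
    CREATE TABLE table_2 (country TEXT, category TEXT, type TEXT, description TEXT);
    INSERT INTO table_1 VALUES
        ('US', 'C', 'T1', 'IN'),
        ('US', 'C', 'T1', 'IN'),   -- duplicate row, present twice here
        ('IN', 'A', 'T2', 'OUT');
    INSERT INTO table_2 VALUES
        ('US', 'C', 'T1', 'IN'),   -- same row, present only once here
        ('IN', 'A', 'T2', 'OUT'),
        ('Q',  'C', 'T2', 'IN');   -- exists only in table_2
""")

# Tag each row with its source, UNION ALL, group by every data column,
# and keep only the groups whose per-table counts differ.
DIFF_SQL = """
    SELECT   COUNT(t_1) AS count_table_1, COUNT(t_2) AS count_table_2,
             country, category, type, description
    FROM     (
               SELECT 'x' AS t_1, NULL AS t_2, country, category, type, description
                 FROM table_1
               UNION ALL
               SELECT NULL AS t_1, 'x' AS t_2, country, category, type, description
                 FROM table_2
             )
    GROUP BY country, category, type, description
    HAVING   COUNT(t_1) != COUNT(t_2)
    ORDER BY country
"""

diff_rows = conn.execute(DIFF_SQL).fetchall()
for row in diff_rows:
    print(row)
```

With this sample data, the row that appears once in each table is filtered out by the HAVING clause, while the duplicated row and the row that exists only in table_2 are reported along with their counts.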

Answer 2

Start with this query to check whether these 4 columns form a key.

select      occ_total,occ_ut,occ_pd
           ,count(*)                as records

from       (select      count (*)                               as occ_total
                       ,count (case tab when 'UT' then 1 end)   as occ_ut
                       ,count (case tab when 'PD' then 1 end)   as occ_pd

            from       (            select 'UT' as tab,cat,AT_NUMBER,OFFSET,PROD from EM
                        union all   select 'PD'       ,cat,AT_NUMBER,OFFSET,PROD from EM_63 PD
                        ) t

            group by    cat,AT_NUMBER,OFFSET,PROD
            ) t

group by    occ_total,occ_ut,occ_pd     

order by    records desc
;
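The key check above can be tried on a small data set. The following is only a sketch under assumptions: it runs on Python's built-in sqlite3 rather than Oracle, the rows are invented, and the candidate key is reduced to three columns (cat, at_number, prod) to keep it short.

```python
import sqlite3

# In-memory SQLite stand-ins for EM and EM_63 (hypothetical sample rows).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EM    (cat TEXT, at_number INTEGER, prod TEXT);
    CREATE TABLE EM_63 (cat TEXT, at_number INTEGER, prod TEXT);
    INSERT INTO EM VALUES
        ('A', 1, 'P1'),
        ('A', 2, 'P1'),
        ('B', 1, 'P2'),
        ('B', 1, 'P2');   -- duplicate: the candidate key is not unique in EM
    INSERT INTO EM_63 VALUES
        ('A', 1, 'P1'),
        ('A', 2, 'P1'),
        ('B', 1, 'P2'),
        ('C', 9, 'P3');   -- exists only in EM_63
""")

# Inner query: per candidate-key value, count occurrences in each table.
# Outer query: count how many key values share each occurrence profile.
KEY_CHECK_SQL = """
    SELECT   occ_total, occ_ut, occ_pd, COUNT(*) AS records
    FROM     (
               SELECT   COUNT(*)                             AS occ_total,
                        COUNT(CASE tab WHEN 'UT' THEN 1 END) AS occ_ut,
                        COUNT(CASE tab WHEN 'PD' THEN 1 END) AS occ_pd
               FROM     (
                          SELECT 'UT' AS tab, cat, at_number, prod FROM EM
                          UNION ALL
                          SELECT 'PD', cat, at_number, prod FROM EM_63
                        ) AS u
               GROUP BY cat, at_number, prod
             ) AS g
    GROUP BY occ_total, occ_ut, occ_pd
    ORDER BY records DESC, occ_total
"""

profile = conn.execute(KEY_CHECK_SQL).fetchall()
for row in profile:
    print(row)
```

If the chosen columns really form a key in both tables, every profile has occ_ut and occ_pd of at most 1. In this sample, the profile with occ_ut = 2 shows that the candidate key is duplicated in EM, and the profile with occ_ut = 0 shows a key value that is missing from EM.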

Once you have chosen the "key", you can use the following query to see the values of the attributes:

select      count (*)                               as occ_total
           ,count (case tab when 'UT' then 1 end)   as occ_ut
           ,count (case tab when 'PD' then 1 end)   as occ_pd

           ,count (distinct att1)                   as cnt_dst_att1
           ,count (distinct att2)                   as cnt_dst_att2
           ,count (distinct att3)                   as cnt_dst_att3
           ,...
           ,listagg (case tab when 'UT' then att1 end) within group (order by att1) as att1_vals_ut
           ,listagg (case tab when 'PD' then att1 end) within group (order by att1) as att1_vals_pd
           ,listagg (case tab when 'UT' then att2 end) within group (order by att2) as att2_vals_ut
           ,listagg (case tab when 'PD' then att2 end) within group (order by att2) as att2_vals_pd
           ,listagg (case tab when 'UT' then att3 end) within group (order by att3) as att3_vals_ut
           ,listagg (case tab when 'PD' then att3 end) within group (order by att3) as att3_vals_pd  
           ,...

from       (            select 'UT' as tab,cat,AT_NUMBER,OFFSET,PROD,att1,att2,att3,... from EM
            union all   select 'PD'       ,cat,AT_NUMBER,OFFSET,PROD,att1,att2,att3,... from EM_63 PD
            ) t

group by    cat,AT_NUMBER,OFFSET,PROD
;
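As a rough illustration of the attribute comparison, the sketch below runs the same idea on Python's built-in sqlite3. Several things here are assumptions rather than part of the answer: SQLite has no listagg, so GROUP_CONCAT is used in its place; the key is reduced to (cat, at_number) with a single attribute att1; the key columns are added to the select list for readability; and the rows are invented.

```python
import sqlite3

# In-memory SQLite stand-ins for EM and EM_63 (hypothetical sample rows).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EM    (cat TEXT, at_number INTEGER, att1 TEXT);
    CREATE TABLE EM_63 (cat TEXT, at_number INTEGER, att1 TEXT);
    INSERT INTO EM    VALUES ('A', 1, 'red'), ('B', 2, 'green');
    INSERT INTO EM_63 VALUES ('A', 1, 'red'), ('B', 2, 'blue');
""")

# Per key value: occurrences in each table, number of distinct att1 values,
# and the att1 values seen on each side (GROUP_CONCAT skips the NULLs that
# the CASE expression yields for rows from the other table).
ATTR_SQL = """
    SELECT   cat, at_number,
             COUNT(CASE tab WHEN 'UT' THEN 1 END)           AS occ_ut,
             COUNT(CASE tab WHEN 'PD' THEN 1 END)           AS occ_pd,
             COUNT(DISTINCT att1)                           AS cnt_dst_att1,
             GROUP_CONCAT(CASE tab WHEN 'UT' THEN att1 END) AS att1_vals_ut,
             GROUP_CONCAT(CASE tab WHEN 'PD' THEN att1 END) AS att1_vals_pd
    FROM     (
               SELECT 'UT' AS tab, cat, at_number, att1 FROM EM
               UNION ALL
               SELECT 'PD', cat, at_number, att1 FROM EM_63
             ) AS u
    GROUP BY cat, at_number
    ORDER BY cat, at_number
"""

attr_rows = conn.execute(ATTR_SQL).fetchall()

# A key value whose attribute differs between the tables has more than one
# distinct att1 value (index 4 is cnt_dst_att1).
mismatched = [row for row in attr_rows if row[4] > 1]
print(mismatched)
```

Filtering on cnt_dst_att1 > 1 is one possible way to narrow the result to the keys whose attribute values disagree; the original query simply lists all keys and leaves the inspection to the reader.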

Answer 3

The problem with CONCAT is that you can get false matches if your data looks like this:

table1.A = '123'
table1.B = '456'

Concatenates to: '123456'

table2.A = '12'
table2.B = '3456'

Also concatenates to: '123456'

You have to compare the fields separately: table1.A = table2.A AND table1.B = table2.B
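
The collision can be reproduced in a few lines. This is a small sketch that assumes SQLite (through Python's built-in sqlite3) and its || operator as a stand-in for Oracle's CONCAT; the values are the ones from the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# First column: compare the concatenated strings.
# Second column: compare the two fields one by one.
concat_match, fieldwise_match = conn.execute("""
    SELECT ('123' || '456') = ('12' || '3456'),
           ('123' = '12') AND ('456' = '3456')
""").fetchone()

print(concat_match, fieldwise_match)  # 1 means true, 0 means false in SQLite
```

The concatenated comparison reports a match even though neither field is equal, which is why the field-by-field comparison is the safer choice.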
