简体   繁体   中英

Compare Data in Two Tables

I have two tables that I need to compare to make sure the values match. For context, one is the employee's time sheet and the other is a table to holds their requests. Each has a time sheet code and a number of hours. My plan was to compare by summary to see if they match. However, I am having problems getting my logic to work perfectly every time. Without me posting my code (it is becoming a mess quickly anyway), how would you approach this? The comparison needs to be able to be run programmatically and in the end return a true/false.

This can be an RPG solution or a SQL solution.

This is what I need to make sure is true.

Table 1
02  1.5
04  16.0

Table 2
02  1.5
04  16.0

The problem is when

Table 1
02  1.5

Table 2
02  1.5
04  16.0

or when

Table 1
02  1.5
04  16.0

Table 2
02  1.5

or more so when

Table 1
02  1.5
04  16.0

Table 2

This should give the requested results:

with w1 ( wid, sumHrs )
 as ( SELECT id, sum(hours)
        FROM A   a1
        GROUP BY id ),
     w2 ( wid, sumHrs )
 as ( SELECT id, sum(hours)
        FROM B  b1
        GROUP BY id )
 select '1', w1.wid from w1 INNER JOIN w2 
                                on w1.wid = w2.wid
                      where w1.sumHrs <> w2.sumHrs 
union
 select '2', w1.wid from w1 EXCEPTION JOIN w2 
                                on w1.wid = w2.wid
union
 select '3', w2.wid from w2 EXCEPTION JOIN w1 
                                on w2.wid = w1.wid 

There might be minor detail changes needed due to adapting and simplifying from a sample I have that's close but not exact. All of the basic concepts should be clear, though.

The CTE gives a couple summary "views" for use in the full statement. The INNER JOIN gives matching IDs with mismatched HOURS. The first EXCEPTION JOIN gives rows from the first table with no matching ID in the second table, and the second EXCEPTION JOIN gives rows that aren't matched in the other direction. And the UNIONs collect all results together.

This will compare the total hours in A and B for each ID value, and only return records there the sum of B does not equal A. It treats unmatched values of B as zero.

SELECT A.id, A.hours, SUM(COALESCE(B.Hours,0))
FROM A
LEFT OUTER JOIN B
  ON B.ID = A.ID
WHERE 1=1
  AND A.id = B.id
GROUP BY A.id
HAVING A.Hours != SUM(COALESCE(B.Hours,0))

Cheers,
Daniel

In general, if you merge two tables, and the count of the resulting table is the same as the count of the individual tables, you can say that they have the same values.

I am not sure how to apply in your case, but maybe you can.

The simplest solution show lines that are in conflict:

CREATE TABLE requests ( employees_id integer, hours decimal );
CREATE TABLE hours ( employees_id integer, hours decimal );

SELECT * FROM requests, hours WHERE requests.employees_id = hours.employees_id AND requests.hours != hours.hours;

But it doesn't work if eg one 2h request needs to match two 1h markings.

I did something like this recently - here are my notes... the result set shows differences in values plus rows that are present in one table but not the other.

Step1: Create temporary tables containing the rows you want to compare.

Step2: Perform a FULL OUTER JOIN on the temp tables using the non aggregate columns as matching criterion.

DECLARE @Ta TABLE (PubA varchar(255), CampA varchar(255), RevA money)   

INSERT INTO @Ta (PubA, CampA, RevA)                     
    SELECT Publisher, [Campaign Name], SUM([Revenue USD])               
    FROM D1.dbo.Stats               
    GROUP BY Publisher, [Campaign Name]

DECLARE @Tb TABLE (PubB varchar(255), CampB varchar(255), RevB money)   

INSERT INTO @Tb (PubB, CampB, RevB)                     
    SELECT Publisher, [Campaign Name], SUM([Revenue USD])               
    FROM D2.dbo.Stats                   
    GROUP BY Publisher, [Campaign Name]

SELECT PubA, CampA, RevA, PubB, CampB, RevB, RevA-RevB DiffRev  
FROM @Ta                        
FULL OUTER JOIN @Tb                     
ON PubA=PubB and CampA=CampB                        
ORDER BY DiffRev DESC                       
GO  

I put my two SQL results into views then based on many peoples feedback already came up with this. I think this could be created into it's own stored procedure.

SELECT SUM ( ERROR_COUNT ) AS TOTAL_ERRORS INTO NUM_ERRORS FROM (
SELECT COUNT ( * ) AS ERROR_COUNT
FROM MPRLIB . V_TSHOURSUMM A EXCEPTION JOIN MPRLIB . V_REQHOURSUMM B
ON A . EM_NUMBER = B . EM_NUMBER AND A . TIMESHEET_CODE = B . TIMESHEET_CODE AND A.HOURS_SUMMARY = B . HOURS_SUMMARY
WHERE A . EM_NUMBER = EMPLOYEE_ID OR B . EM_NUMBER = EMPLOYEE_ID
UNION
SELECT COUNT ( * ) AS ERROR_COUNT
FROM MPRLIB . V_REQHOURSUMM A EXCEPTION JOIN MPRLIB . V_TSHOURSUMM B
ON A . EM_NUMBER = B . EM_NUMBER AND A . TIMESHEET_CODE = B . TIMESHEET_CODE AND A . HOURS_SUMMARY = B . HOURS_SUMMARY
WHERE A . EM_NUMBER = EMPLOYEE_ID OR B . EM_NUMBER = EMPLOYEE_ID ) TABLE

It seems to work, but seems... excessive. Thoughts? Is there a better way?

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM