简体   繁体   中英

SQL - Check table for new rows?

I have two tables, for example:

Table A                                  Table B
=======                                  =======

Name         | Color                     Name         | Color
----------------------                   ----------------------
Mickey Mouse | red                       Mickey Mouse | red
Donald Duck  | green                     Donald Duck  | blue
Donald Duck  | blue                      Minnie       | red
Goofy        | black
Minnie       | red

Table A is my source table and B is the destination table. Now I need a query which finds all the different (additional) rows in table A so table B can be updated with those rows. So I need a query which finds me the following rows from table A:

Name         | Color  
----------------------
Donald Duck  | green  
Goofy        | black

What is a good approach for such a query? It should be as efficient as possible (avoid too many joins). Thanks for any help!

I would use a NOT EXISTS structure.

SELECT Name, Color
FROM TableA
WHERE NOT EXISTS (
SELECT 1 FROM TableB
WHERE TableA.Name = TableB.Name 
AND TableA.Color = TableB.Color)
SELECT a.Name, a.Color
FROM a LEFT OUTER JOIN b ON (a.Name = b.Name AND a.Color = b.Color)
WHERE b.Name IS NULL AND b.Color IS NULL
Select A.Name, A.Color
From A left join B on A.Name = B.Name and A.Color = B.Color
Where B.Name is null

In SQL Server 2008, you can use the EXCEPT operator, which is used like a UNION but returns everything from the first query, except where it is also in the second:

SELECT * FROM TABLEA EXCEPT SELECT * FROM TABLEB

I understand that Oracle has a MINUS operator that does the same thing.

You can use the EXCEPT operator, which is the opposite of UNION. In Oracle, the equivalent is MINUS.

SELECT * FROM TABLE_A
EXCEPT
SELECT * FROM TABLE_B

I usually add a column "updated TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP", and I use its value to check when a new row is inserted or an existing one is modified.

In an application I developed I needed to solve a problem similar to yours, so I saved somewhere the MAX(updated) of B, and then with a query I identified all the rows where A.updated>B.updated, and the result are all new+modified rows.

Since the field default value is CURRENT_TIMESTAMP and it auto-updates "ON UPDATE" you never have to explicitly set its value.

A NOT EXISTS subquery should resolve to an outer join:

SELECT Name, Color
FROM TableA
WHERE NOT EXISTS (
  SELECT 1
  FROM TableB
  WHERE TableA.Color = TableB.Color
  AND   TableA.Name  = TableB.Name
)

Or you could just use an outer join directly:

SELECT TableA.Name, TableA.Color
FROM TableA
LEFT OUTER JOIN TableB
  ON  TableA.Name  = TableB.Name
  AND TableA.Color = TableB.Color
WHERE TableB.Name IS NULL

They should be equally performant; it's a question of which you feel is more intuitive.

There are many correct answers up already, but I want to bring up a philosophical point:

Is this database schema really viable in a production environment inside a single schema?

Does it really make sense to have two tables containing data, and then writing a query to compare one to the other? I think it would make sense to have just a single table, and perhaps put in a date identifier to find records added after a certain point.

The only situation I can think of where you'd want to do this is where you have two separate databases and you want to "synchronize" them, or when you would want to find differences between the two, say, comparing a backup and production.

In Oracle you would probably use:

MERGE INTO b USING
  (SELECT name, color 
     FROM a) src 
 ON (src.name = b.name AND color = src.color)
 WHEN NOT MATCHED THEN 
    INSERT (name, color)
    VALUES (src.name, src.color);

If your table has a primary key (do you really have tables without one?) like NAME, and you would like to INSERT or UPDATE depending on the existence of the record in table B, you would use:

MERGE INTO b USING 
  (SELECT name, color 
     FROM a) src 
  ON (src.name = b.name) 
  WHEN NOT MATCHED THEN 
    INSERT (name, color)
    VALUES (src.name, src.color) 
  WHEN MATCHED THEN 
    UPDATE 
    SET color = src.color; 

I take it that SQL Server also has a MERGE statement or similar.

SELECT TableA.Name, TableA.Color FROM TableA WHERE TableA.Name + TableA.Color NOT IN (SELECT TableB.Name + TableB.Color FROM TableB)

INSERT INTO B 
SELECT a.Name, a.Color
FROM A a
LEFT JOIN B b ON a.Name = b.Name AND a.Color = b.Color
WHERE b.Color IS NULL AND b.Name IS NULL
    select 
        name, 
        color
    from 
        tableA 
    where 
        concat(name, '|',color)  
        not in 
        (
            select 
                concat(name, '|',color) 
            from 
                tableB
        )-- not in 

Working coy can be found at db fiddle

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM