How to insert unique rows from one table into another table in PostgreSQL?

I want to insert the unique rows from product_table into bill_table_unique.

The duplicate rows then have to be moved from product_table to bill_table_duplicate.

Once all the data has been moved, the product_table entries have to be cleared (only the rows that were moved should be cleared).

Input :

product_table

id     bill_id     entry_date                   product_name stock
-------------------------------------------------------------------
1      009         2020-12-11 02:05:20.09876    apple        5
2      009         2020-12-11 02:05:20.09876    apple        5
3      009         2020-09-11 02:05:20.09876    apple        5
4      002         2020-12-11 02:05:20.09876    berry        5
5      002         2020-12-11 02:05:20.09876    berry        5
6      004         2020-12-11 10:05:20.09876    pineapple    1
7      006         2020-12-11 10:05:20.09876    pineapple    4

STEP 1: insert the unique rows from product_table into bill_table_unique.

Output: bill_table_unique

id     bill_id     entry_date                   product_name stock
-------------------------------------------------------------------
1      009         2020-12-11 02:05:20.09876    apple        5
2      009         2020-09-11 02:05:20.09876    apple        5
3      002         2020-12-11 02:05:20.09876    berry        5
4      004         2020-12-11 10:05:20.09876    pineapple    1
5      006         2020-12-11 10:05:20.09876    pineapple    4

STEP 2: move the duplicate rows from product_table to bill_table_duplicate.

Output: bill_table_duplicate

id     bill_id     entry_date                   product_name stock
---------------------------------------------------------------------
1      009         2020-12-11 02:05:20.09876    apple        5
2      002         2020-12-11 02:05:20.09876    berry        5

STEP 3: once all the data has been moved, clear the product_table entries (only the rows that were moved should be cleared).

Output: product_table

id     bill_id     entry_date                   product_name stock
--------------------------------------------------------------------

Use two queries as follows:

-- Assumes id in both target tables is auto-generated (e.g. SERIAL),
-- so only the data columns are listed.
INSERT INTO bill_table_unique (bill_id, entry_date, product_name, stock)
SELECT DISTINCT bill_id, entry_date, product_name, stock
FROM product_table;

INSERT INTO bill_table_duplicate (bill_id, entry_date, product_name, stock)
SELECT bill_id, entry_date, product_name, stock
FROM (SELECT bill_id, entry_date, product_name, stock,
             ROW_NUMBER() OVER (PARTITION BY bill_id, entry_date, product_name, stock
                                ORDER BY id) AS rn
      FROM product_table) t
WHERE rn >= 2;  -- every copy after the first in each group

CREATE TABLE foo (id INTEGER PRIMARY KEY, bill_id TEXT NOT NULL, 
 entry_date TIMESTAMP NOT NULL, product_name TEXT NOT NULL, 
 stock INTEGER NOT NULL);
 
\copy foo from stdin
1   009 2020-12-11 02:05:20.09876   apple   5
2   009 2020-12-11 02:05:20.09876   apple   5
3   009 2020-09-11 02:05:20.09876   apple   5
4   002 2020-12-11 02:05:20.09876   berry   5
5   002 2020-12-11 02:05:20.09876   berry   5
6   004 2020-12-11 10:05:20.09876   pineapple   1
7   006 2020-12-11 10:05:20.09876   pineapple   4
\.

CREATE TABLE foo1 (id SERIAL PRIMARY KEY, bill_id TEXT NOT NULL, 
 entry_date TIMESTAMP NOT NULL, product_name TEXT NOT NULL, 
 stock INTEGER NOT NULL);

CREATE TABLE foo2 (id SERIAL PRIMARY KEY, bill_id TEXT NOT NULL, 
 entry_date TIMESTAMP NOT NULL, product_name TEXT NOT NULL, 
 stock INTEGER NOT NULL);

WITH a AS (
 SELECT min(id) id, bill_id, entry_date, product_name, stock
 FROM foo GROUP BY bill_id, entry_date, product_name, stock
), b AS (
 INSERT INTO foo1 (bill_id, entry_date, product_name, stock) 
 SELECT bill_id, entry_date, product_name, stock FROM a)
INSERT INTO foo2 (bill_id, entry_date, product_name, stock)
 SELECT f.bill_id, f.entry_date, f.product_name, f.stock
 FROM foo f LEFT JOIN a USING (id) WHERE a.id IS NULL;

The first CTE in the WITH query (a) computes which rows to transfer into table foo1. I picked the row with the minimum id in each group.

The second CTE (b) inserts those rows into foo1.

The final INSERT, which is the main statement rather than a CTE, inserts into foo2 the rows that were not picked in the first step; these are the duplicates.

If the number of rows isn't too large, this should use a HashAggregate and a Hash Join, and be much faster than a query using a sort. The deduplication is also done only once.
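Whether the planner really picks a hash-based plan depends on the table statistics and on settings such as work_mem, so it is worth checking rather than assuming. A minimal sketch, assuming the foo table defined above, is to run EXPLAIN on the deduplication step:

```sql
-- Show the plan the planner chooses for the grouping step.
-- A "HashAggregate" node means hashing is used; a "GroupAggregate"
-- node on top of a "Sort" node means the sort-based strategy was chosen.
EXPLAIN
SELECT min(id) AS id, bill_id, entry_date, product_name, stock
FROM foo
GROUP BY bill_id, entry_date, product_name, stock;
```

The same can be done for the whole WITH statement by prefixing it with EXPLAIN, which shows the plan without executing the inserts.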

STEP 3: once all the data has been moved, clear the product_table entries (only the rows that were moved should be cleared)

Since the question specifies no condition under which a row should not be moved, every row in the original table has been moved, so:

TRUNCATE TABLE product_table;
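TRUNCATE is only safe if nothing writes to product_table between the copy and the cleanup; a row committed in between would be discarded without ever being copied. If that is a concern, one possible alternative (a sketch, not taken from the answers here) is to delete and copy in a single statement with DELETE ... RETURNING, so that exactly the deleted rows are the ones moved. It assumes the table names from the question and that id in both target tables is auto-generated; the CTE names moved, ranked and ins_unique are arbitrary:

```sql
WITH moved AS (
    -- Remove the rows and hand them to the rest of the statement.
    DELETE FROM product_table
    RETURNING id, bill_id, entry_date, product_name, stock
), ranked AS (
    -- Number the copies inside each group of identical rows.
    SELECT m.id, m.bill_id, m.entry_date, m.product_name, m.stock,
           ROW_NUMBER() OVER (PARTITION BY m.bill_id, m.entry_date,
                                           m.product_name, m.stock
                              ORDER BY m.id) AS seqnum
    FROM moved m
), ins_unique AS (
    -- First copy of each group goes to the "unique" table.
    INSERT INTO bill_table_unique (bill_id, entry_date, product_name, stock)
    SELECT bill_id, entry_date, product_name, stock
    FROM ranked
    WHERE seqnum = 1
)
-- All remaining copies go to the "duplicate" table.
INSERT INTO bill_table_duplicate (bill_id, entry_date, product_name, stock)
SELECT bill_id, entry_date, product_name, stock
FROM ranked
WHERE seqnum > 1;
```

Because everything happens in one statement, it either moves and deletes all the rows it saw or, on error, changes nothing.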

I would use row_number() to enumerate the rows in the original table. The insertions into the other tables then rely only on simple filters:

WITH pt AS (
      SELECT pt.*,
             ROW_NUMBER() OVER (PARTITION BY bill_id, entry_date, product_name, stock ORDER BY id) as seqnum
      FROM product_table pt
     ),
     btu AS (
      INSERT INTO bill_table_unique (bill_id, entry_date, product_name, stock) 
          SELECT bill_id, entry_date, product_name, stock
          FROM pt
          WHERE seqnum = 1
     )
INSERT INTO bill_table_duplicate (bill_id, entry_date, product_name, stock) 
    SELECT bill_id, entry_date, product_name, stock
    FROM pt
    WHERE seqnum > 1;
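A quick sanity check after the insert, and before product_table is cleared, is to compare row counts: every source row should have landed in exactly one of the two target tables. This is only a sketch and assumes both target tables were empty beforehand:

```sql
-- unique_rows + duplicate_rows should equal source_rows.
-- For the sample data in the question that is 5 + 2 = 7.
SELECT (SELECT count(*) FROM product_table)        AS source_rows,
       (SELECT count(*) FROM bill_table_unique)    AS unique_rows,
       (SELECT count(*) FROM bill_table_duplicate) AS duplicate_rows;
```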
