
Deleting all but the most recent entry from single SQL table

I have a single SQL table that contains multiple entries for each customerID (some customerIDs have only one entry, which I want to keep). I need to remove all but the most recent entry per customerID, using the invoiceDate field as my marker.

So I need to go from this:

+------------+-------------+-----------+
| customerID | invoiceDate | invoiceID |
+------------+-------------+-----------+
|          1 |  1393995600 |       xx  |
|          1 |  1373688000 |       xx  |
|          1 |  1365220800 |       xx  |
|          2 |  1265220800 |       xx  |
|          2 |  1173688000 |       xx  |
|          3 |  1325330800 |       xx  |
+------------+-------------+-----------+

To this:

+------------+-------------+-----------+
| customerID | invoiceDate | invoiceID |
+------------+-------------+-----------+
|          1 |  1393995600 |       xx  |
|          2 |  1265220800 |       xx  |
|          3 |  1325330800 |       xx  |
+------------+-------------+-----------+

Any guidance would be greatly appreciated!

  1. Write a query to select all the rows you want to delete:
SELECT * FROM t
WHERE invoiceDate NOT IN (
    SELECT MAX(invoiceDate)
    -- MySQL won't let a DELETE read its own target table directly ("FROM t AS t2"), hence the derived table; see http://stackoverflow.com/a/14302701/227576
    FROM (SELECT * FROM t) AS t2
    WHERE t2.customerId = t.customerId
    GROUP BY t2.customerId
)

This may take a long time on a big database.

  2. If you're satisfied, change the query to a DELETE statement:
DELETE FROM t
WHERE invoiceDate NOT IN (
    SELECT MAX(invoiceDate)
    -- MySQL won't let a DELETE read its own target table directly ("FROM t AS t2"), hence the derived table; see http://stackoverflow.com/a/14302701/227576
    FROM (SELECT * FROM t) AS t2
    WHERE t2.customerId = t.customerId
    GROUP BY t2.customerId
)

See http://sqlfiddle.com/#!9/6e031/1
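To try the two-step workflow without touching real data, here is a minimal sketch that loads the question's sample rows into an in-memory SQLite database through Python's sqlite3 module and runs the same SELECT-then-DELETE pair. SQLite is only an assumed stand-in for MySQL here, and the helper names (make_sample_db, rows_to_delete, delete_old_rows) are hypothetical.

```python
import sqlite3

# Condition shared by the preview SELECT and the DELETE, copied from the answer.
CONDITION = """
    invoiceDate NOT IN (
        SELECT MAX(invoiceDate)
        FROM (SELECT * FROM t) AS t2
        WHERE t2.customerID = t.customerID
        GROUP BY t2.customerID
    )
"""

# The six rows from the question's "before" table.
SAMPLE_ROWS = [
    (1, 1393995600, "xx"),
    (1, 1373688000, "xx"),
    (1, 1365220800, "xx"),
    (2, 1265220800, "xx"),
    (2, 1173688000, "xx"),
    (3, 1325330800, "xx"),
]


def make_sample_db():
    """Create an in-memory table t holding the question's sample data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t (customerID INTEGER, invoiceDate INTEGER, invoiceID TEXT)"
    )
    conn.executemany("INSERT INTO t VALUES (?, ?, ?)", SAMPLE_ROWS)
    return conn


def rows_to_delete(conn):
    """Step 1: preview the rows that the DELETE would remove."""
    sql = (
        "SELECT customerID, invoiceDate FROM t WHERE"
        + CONDITION
        + "ORDER BY customerID, invoiceDate"
    )
    return conn.execute(sql).fetchall()


def delete_old_rows(conn):
    """Step 2: run the DELETE and return the number of rows removed."""
    cur = conn.execute("DELETE FROM t WHERE" + CONDITION)
    conn.commit()
    return cur.rowcount


if __name__ == "__main__":
    conn = make_sample_db()
    print("would delete:", rows_to_delete(conn))
    print("deleted:", delete_old_rows(conn))
    print("remaining:", conn.execute("SELECT * FROM t ORDER BY customerID").fetchall())
```

Keeping the condition in one place means the preview and the DELETE cannot drift apart between step 1 and step 2.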

If you have multiple rows whose date is the most recent for the same customer, you would have to look for duplicates and decide which one you want to keep yourself. For instance, look at customerId 2 on the SQL fiddle link above.
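One way to look for such ties before deleting is to count how many rows share each customer's latest invoiceDate. The sketch below does this with Python's sqlite3 module against an in-memory table; SQLite, the table name t, the sample invoiceIDs and the helpers find_tied_customers and demo_connection are all assumptions made for illustration, not data from the fiddle.

```python
import sqlite3

# For every customer, count the rows that carry that customer's latest
# invoiceDate and report only the customers where more than one row does.
TIES_QUERY = """
    SELECT a.customerID, a.invoiceDate, COUNT(*) AS copies
    FROM t AS a
    WHERE a.invoiceDate = (
        SELECT MAX(b.invoiceDate) FROM t AS b WHERE b.customerID = a.customerID
    )
    GROUP BY a.customerID, a.invoiceDate
    HAVING COUNT(*) > 1
    ORDER BY a.customerID
"""


def find_tied_customers(conn):
    """Return (customerID, invoiceDate, copies) for each tied latest date."""
    return conn.execute(TIES_QUERY).fetchall()


def demo_connection():
    """Build a hypothetical table in which customer 2 has a tie."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t (customerID INTEGER, invoiceDate INTEGER, invoiceID TEXT)"
    )
    conn.executemany(
        "INSERT INTO t VALUES (?, ?, ?)",
        [
            (1, 1393995600, "a1"),
            (1, 1373688000, "a2"),
            (2, 1265220800, "b1"),
            (2, 1265220800, "b2"),  # same latest date as b1
            (3, 1325330800, "c1"),
        ],
    )
    return conn


if __name__ == "__main__":
    print(find_tied_customers(demo_connection()))
```

If the query returns no rows, the NOT IN delete leaves exactly one row per customer; otherwise the listed customers keep several rows and need a manual decision or an extra tie-breaking column.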

Let us assume that the table name is transaction_table.

create table test1 AS
select * from (
  select * from transaction_table order by customerID, invoiceDate desc) temp
group by customerID

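The output data will be in the test1 table. Note that this relies on MySQL's non-standard GROUP BY behaviour: it fails when ONLY_FULL_GROUP_BY is enabled, and otherwise MySQL does not guarantee which row's values are returned for each group, so check the result before relying on it.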

Try this one (SQL Server syntax):

with todelete as
(
    select CustomerId, InvoiceId, InvoiceDate,
           Row_Number() over (partition by CustomerId order by InvoiceDate desc) as Count
    from DeleteDuplicate
)
delete from todelete
where Count > 1

-- delete every row that is older than the newest invoice of the same customer
delete from ex_4
where rowid in
(select a.rowid
from ex_4 a
where to_date(a.invoicedate,'DDMMYYYY') < (select max(to_date(b.invoicedate,'DDMMYYYY')) from ex_4 b where b.customerid = a.customerid))

This is how it can be done in Oracle. The query deletes all but the most recent row for each customer. Looking at your table structure, I am assuming that the invoicedate column is of type varchar2, so I convert it to a date with the to_date function.
