SQL: delete all but 2 duplicates
I want to be able to limit the number of duplicate records in a MySQL database table to 2 (excluding the id field, which is auto-increment).
My table is set up like this:
id  city     item
---------------------
1   Miami    4
2   Detroit  5
3   Miami    4
4   Miami    18
5   Miami    4
So in that table, only row 5 would be deleted.
How can I do this?
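For anyone who wants to try the answers against the sample data, the table above could be recreated roughly as follows. The table name `yourTable` is borrowed from the first answer, and the column types are assumptions, since the question does not state them.

```sql
-- Assumed schema: only the column names come from the question.
CREATE TABLE yourTable (
    id   INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    city VARCHAR(50) NOT NULL,
    item INT NOT NULL
);

-- Explicit ids so the rows match the sample table exactly.
INSERT INTO yourTable (id, city, item) VALUES
    (1, 'Miami',   4),
    (2, 'Detroit', 5),
    (3, 'Miami',   4),
    (4, 'Miami',   18),
    (5, 'Miami',   4);
```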
MySQL has some foibles when reading from and writing to the same table, so I don't actually know whether this will work. The syntax is fine in many implementations of SQL, but I don't know if it's MySQL-friendly...
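-- Delete every row that has more than one earlier row (lower id)
-- with the same city and item.
-- Note: MySQL rejects a DELETE whose subquery reads the target table
-- (error 1093), so this form is mainly useful on other databases.
DELETE FROM yourTable
WHERE 1 < (SELECT COUNT(*)
           FROM yourTable AS Lookup
           WHERE Lookup.city = yourTable.city
             AND Lookup.item = yourTable.item
             AND Lookup.id < yourTable.id);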
EDIT
Amazingly convoluted, but worth a try?
DELETE
    yourTable
FROM
    yourTable
INNER JOIN
(
    SELECT
        id
    FROM
    (
        SELECT
            id
        FROM
            yourTable
        WHERE
            1 < (SELECT COUNT(*)
                 FROM yourTable AS Lookup
                 WHERE Lookup.city = yourTable.city AND Lookup.item = yourTable.item AND Lookup.id < yourTable.id)
    )
        AS inner_deletes
)
    AS deletes
        ON deletes.id = yourTable.id;
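If the server happens to be MySQL 8.0 or later, the same rule (drop any row that has at least two earlier rows with the same city and item) can probably be written more directly with a window function. This is a different technique from the correlated subquery above, not a restatement of it, and the aliases `t` and `ranked` are just illustrative names.

```sql
-- Sketch assuming MySQL 8.0+, where ROW_NUMBER() is available.
-- Number the rows inside each (city, item) group by ascending id,
-- then delete everything from the third row onwards.
DELETE t
FROM yourTable AS t
INNER JOIN (
    SELECT id,
           ROW_NUMBER() OVER (PARTITION BY city, item ORDER BY id) AS rn
    FROM yourTable
) AS ranked
    ON ranked.id = t.id
WHERE ranked.rn > 2;  -- 2 = number of rows to keep per group
```

Changing `ORDER BY id` to `ORDER BY id DESC` would keep the two newest rows instead of the two oldest.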
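I think your real problem is that your code and/or table structure allows duplicates to be inserted, and you are asking this when you should really be fixing the database and/or the code.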
I think a better solution is to avoid letting the table exceed the limit in the first place: implement a validation so that a new insert is not accepted when SELECT COUNT(*) for that city and item shows the limit has already been reached.
If you want to do this in the data tier instead, you will have to use a stored procedure, because you first need to identify every group that has more records than the limit and then delete only the last ones.
Saludos
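One way the validation described above might look in plain SQL is a conditional insert that only adds the row while fewer than two matching rows exist. The table name `yourTable`, the literal values, and the aliases `candidate` and `existing` are assumptions for illustration.

```sql
-- Sketch: insert ('Miami', 4) only if fewer than 2 such rows exist.
-- The candidate row is wrapped in a derived table so the count can be
-- compared against it in the WHERE clause.
INSERT INTO yourTable (city, item)
SELECT candidate.city, candidate.item
FROM (SELECT 'Miami' AS city, 4 AS item) AS candidate
WHERE (SELECT COUNT(*)
       FROM yourTable AS existing
       WHERE existing.city = candidate.city
         AND existing.item = candidate.item) < 2;  -- 2 = allowed copies
```

Note that two sessions running this at the same moment could both pass the check, so on a busy table it would likely need to run inside a transaction with suitable locking.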
Because MySQL is notoriously difficult when it comes to updating a table that the same statement also queries (see, for example, the answers from Dems), the best I can come up with is sadly more than one statement, but on the plus side it is fairly readable:
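-- Collect the ids that are neither the first nor the last of their group.
CREATE TEMPORARY TABLE Dump AS
SELECT id FROM table1 WHERE id NOT IN
    (SELECT MIN(id) FROM table1 GROUP BY city, item UNION
     SELECT MAX(id) FROM table1 GROUP BY city, item);
-- Delete those rows, then clean up.
DELETE FROM table1 WHERE id IN (SELECT id FROM Dump);
DROP TEMPORARY TABLE Dump;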
I wasn't sure whether it matters which duplicates are removed; this keeps the first and the last.
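To confirm the cleanup did what was intended, a quick check like the one below could be run afterwards. It is only a sanity-check sketch against the same `table1`, and the column alias `copies` is made up for the example.

```sql
-- Lists any (city, item) group that still has more than 2 rows.
-- An empty result means every group is down to at most 2 copies.
SELECT city, item, COUNT(*) AS copies
FROM table1
GROUP BY city, item
HAVING COUNT(*) > 2;
```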
In your reply to Joachim's answer, you ask about saving 3 or 5 rows; this is one way to accomplish it. Depending on how you are using this database, you could either call this in a loop or turn it into a stored procedure. Either way, you would keep running this entire block of code until Rows Affected = 0:
drop table if exists TempTable;

create table TempTable
select city, item,
       count(*) as record_count,
       min(id) as ItemToDrop  -- this could be changed to max() if you
                              -- want to delete new stuff instead
from YourTable
group by city, item
having count(*) > 2;          -- This value = number of rows you save

delete from YourTable
where id in (select ItemToDrop from TempTable);
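As a rough illustration of the stored-procedure option, the block above could be wrapped in a loop that stops once a pass deletes nothing. The procedure name `trim_duplicates` and the variable `affected` are hypothetical, and this sketch uses a temporary table rather than the regular table above. `DELIMITER` is a command of the mysql command-line client, so other clients may need the procedure body submitted differently.

```sql
DELIMITER //

CREATE PROCEDURE trim_duplicates()
BEGIN
    DECLARE affected INT DEFAULT 1;

    -- Each pass removes one row (the lowest id) from every group
    -- that still has more than 2 rows; repeat until nothing is deleted.
    WHILE affected > 0 DO
        DROP TEMPORARY TABLE IF EXISTS TempTable;

        CREATE TEMPORARY TABLE TempTable
        SELECT MIN(id) AS ItemToDrop
        FROM YourTable
        GROUP BY city, item
        HAVING COUNT(*) > 2;  -- number of rows to keep per group

        DELETE FROM YourTable
        WHERE id IN (SELECT ItemToDrop FROM TempTable);

        -- ROW_COUNT() reports how many rows the DELETE just removed.
        SET affected = ROW_COUNT();
    END WHILE;

    DROP TEMPORARY TABLE IF EXISTS TempTable;
END //

DELIMITER ;

CALL trim_duplicates();
```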