
How do I remove duplicates using DELETE FROM and a sub-query?

Notes:

  • SELECT version() returns 10.5.12-MariaDB-log
  • Default collation: utf8mb4_unicode_ci
  • Default charset: utf8mb4
  • Queries are run using MySQL Workbench 8.0.29 for Ubuntu Linux

My goal is to delete duplicated items in a table. I do not have any other table to use as a reference to check for duplicates. I created a simple query with a subquery that returns the expected results:


    SELECT * FROM messages WHERE id NOT IN
        (SELECT id FROM
            messages
        WHERE
            uid = '11899414026778263'
        GROUP BY message_id , uid
        ORDER BY created_at);

Despite setting SQL_SAFE_UPDATES to 0, a DELETE operation using the same subquery fails. I get a "Query Interrupted" message.

    SET SQL_SAFE_UPDATES = 0;
    DELETE FROM messages WHERE id NOT IN
        (SELECT id FROM
            messages
        WHERE
            uid = '11899414026778263'
        GROUP BY message_id , uid
        ORDER BY created_at);
    SET SQL_SAFE_UPDATES = 1;

If I replace DELETE with SELECT *, the query returns results. Another StackOverflow answer said that querying based on a sub-query does not work in MySQL. Others say to use another table as a reference instead of a subquery. Related question: "DELETE query results in 'Query Interrupted' MySQL Workbench?"

According to other answers and websites, this method works in some SQL implementations.

ORDER BY is applied after grouping, so your ORDER BY is not sufficient to select the id with the lowest created_at for each group. Under ONLY_FULL_GROUP_BY the query is rejected outright, because id is neither aggregated nor part of the GROUP BY; without that mode it runs, but the id returned by the subselect is arbitrary within each group. You want to use first_value instead.
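As a rough sketch of the first_value approach, assuming the table and columns from the question, the subquery could pick one id per (message_id, uid) group with a window function. The id tie-breaker in the ORDER BY, the outer uid filter, and the names keep_id and keepers are additions for illustration, not part of the original query. The outer filter matters: without it, NOT IN would also match the rows of every other uid.

```sql
-- Preview the ids that would be kept: the earliest created_at per group,
-- with id as a tie-breaker (the tie-breaker is an assumption).
SELECT DISTINCT
    FIRST_VALUE(id) OVER (
        PARTITION BY message_id, uid
        ORDER BY created_at, id
    ) AS keep_id
FROM messages
WHERE uid = '11899414026778263';

-- Delete everything else for that uid. The derived table (keepers) is the
-- usual workaround for MySQL refusing to read from the table being deleted
-- from; on MariaDB 10.5 it may not be strictly necessary.
DELETE FROM messages
WHERE uid = '11899414026778263'
  AND id NOT IN (
      SELECT keep_id
      FROM (
          SELECT DISTINCT
              FIRST_VALUE(id) OVER (
                  PARTITION BY message_id, uid
                  ORDER BY created_at, id
              ) AS keep_id
          FROM messages
          WHERE uid = '11899414026778263'
      ) AS keepers
  );
```

Running the SELECT first and checking the result before the DELETE is a cheap safeguard, since both use the same window definition.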

But it's easier just to not use a subquery:但是不使用子查询更容易:

    delete m
    from messages m
    # is there a message we would prefer over this one?
    inner join messages m2
        on (m2.uid, m2.message_id) = (m.uid, m.message_id)
        and (m2.created_at, m2.id) < (m.created_at, m.id);
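Before running the join-based delete, it may help to preview which rows it would remove. The following is a sketch only: it rewrites the same join condition as an EXISTS check, so each doomed row is listed once, and it spells out the row comparison (created_at, id) < (created_at, id) as its equivalent two-part condition.

```sql
-- Rows the delete would remove: every row for which a preferred row exists,
-- i.e. same uid and message_id with an earlier created_at, or the same
-- created_at and a lower id.
SELECT m.*
FROM messages m
WHERE EXISTS (
    SELECT 1
    FROM messages m2
    WHERE m2.uid = m.uid
      AND m2.message_id = m.message_id
      AND (m2.created_at < m.created_at
           OR (m2.created_at = m.created_at AND m2.id < m.id))
);
```

Because the comparison falls back to id when created_at is equal, exactly one row per (uid, message_id) group has no preferred row and therefore survives, assuming id is unique.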
