MySQL GROUP BY functional dependence in subquery

I'm writing a query to find duplicate rows in a table of people (including each duplicate):

SELECT *
FROM Person
WHERE CONCAT(firstName,lastName) IN (
    SELECT CONCAT(firstName,lastName) AS name
    FROM Person
    GROUP BY CONCAT(firstName,lastName)
    HAVING COUNT(*) > 1
)

When running this in MySQL 8.0.19 with ONLY_FULL_GROUP_BY enabled, it fails with the following error:

Query 1 ERROR: Expression #1 of HAVING clause is not in GROUP BY clause and contains nonaggregated column 'Person.firstName' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by

I can't figure out how to fix this. I tried changing COUNT(*) to COUNT(CONCAT(firstName,lastName)) but that didn't help.

What's odd is that a) it runs fine in MariaDB 10.2, with or without ONLY_FULL_GROUP_BY, and b) running the subquery by itself causes no issue.

What am I doing wrong? It almost seems like a bug in MySQL.

[edit]: I certainly appreciate alternative solutions to my query; however, I'm really interested in an answer as to why my error is occurring.

Try the query below; it does what your query was trying to do:

SELECT *
FROM Person
WHERE (firstName,lastName) IN (
    SELECT firstName,lastName
    FROM Person
    GROUP BY firstName,lastName
    HAVING COUNT(*) > 1
)

Do not merge fields:

SELECT *
FROM Person
WHERE (firstName,lastName) IN (
    SELECT firstName,lastName
    FROM Person
    GROUP BY firstName,lastName
    HAVING COUNT(*) > 1
)
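
A side benefit of comparing the columns separately: concatenating without a separator can report rows as duplicates when they are not. Here is a minimal sketch of that, assuming Python's built-in sqlite3 module as a stand-in for MySQL (SQLite writes CONCAT as ||, and the sample rows are made up for illustration):

```python
import sqlite3

# In-memory SQLite database; a stand-in for MySQL, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Person (id INTEGER PRIMARY KEY, firstName TEXT, lastName TEXT)"
)
conn.executemany(
    "INSERT INTO Person (id, firstName, lastName) VALUES (?, ?, ?)",
    [
        (1, "Anna", "Bell"),
        (2, "Anna", "Bell"),  # genuine duplicate of row 1
        (3, "Ann", "aBell"),  # different name, but also concatenates to "AnnaBell"
        (4, "Carl", "Diaz"),
    ],
)

# Duplicate search on the concatenated name (|| is SQLite's concatenation operator).
concat_ids = [
    row[0]
    for row in conn.execute(
        """
        SELECT id FROM Person
        WHERE firstName || lastName IN (
            SELECT firstName || lastName
            FROM Person
            GROUP BY firstName || lastName
            HAVING COUNT(*) > 1
        )
        ORDER BY id
        """
    )
]

# Duplicate search comparing the two columns as a pair.
tuple_ids = [
    row[0]
    for row in conn.execute(
        """
        SELECT id FROM Person
        WHERE (firstName, lastName) IN (
            SELECT firstName, lastName
            FROM Person
            GROUP BY firstName, lastName
            HAVING COUNT(*) > 1
        )
        ORDER BY id
        """
    )
]

print(concat_ids)  # [1, 2, 3] -> row 3 is a false positive
print(tuple_ids)   # [1, 2]
```

The pair comparison only matches rows whose first and last names are each equal, so row 3 drops out.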

Or use the ANY_VALUE() function:

SELECT *
FROM Person
WHERE CONCAT(firstName,lastName) IN (
    SELECT ANY_VALUE(CONCAT(firstName,lastName)) AS name
    FROM Person
    GROUP BY CONCAT(firstName,lastName)
    HAVING COUNT(*) > 1
)

I would write your query with EXISTS logic:

SELECT p1.*
FROM Person p1
WHERE EXISTS (SELECT 1 FROM Person p2
              WHERE p2.firstName = p1.firstName AND
                    p2.lastName = p1.lastName AND
                    p2.id <> p1.id);

In effect, this selects every person for whom there is another, different person (as identified by the primary key column id, or whatever the primary key might be) with the same first and last name.

The following index may speed up the above query:

CREATE INDEX idx ON Person (lastName, firstName);

This should allow the EXISTS lookup to evaluate quickly. Note that on InnoDB, secondary indexes implicitly include the primary key columns, so id should be covered automatically, as if it were appended to the end of the above two-column index.
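
To check that the EXISTS version returns the same rows as a grouped duplicate search, here is a small sketch. It assumes Python's built-in sqlite3 module in place of MySQL, an integer id primary key, and made-up sample rows; it says nothing about MySQL's query plan or timing.

```python
import sqlite3

# In-memory SQLite database; a stand-in for MySQL, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Person (id INTEGER PRIMARY KEY, firstName TEXT, lastName TEXT)"
)
# Same index definition as above.
conn.execute("CREATE INDEX idx ON Person (lastName, firstName)")
conn.executemany(
    "INSERT INTO Person (id, firstName, lastName) VALUES (?, ?, ?)",
    [
        (1, "Anna", "Bell"),
        (2, "Carl", "Diaz"),
        (3, "Anna", "Bell"),  # duplicate of row 1
        (4, "Anna", "Diaz"),  # shares one column with others, but not both
    ],
)

# EXISTS form: keep a row if another row (different id) has the same name.
exists_ids = [
    row[0]
    for row in conn.execute(
        """
        SELECT p1.id
        FROM Person p1
        WHERE EXISTS (SELECT 1 FROM Person p2
                      WHERE p2.firstName = p1.firstName AND
                            p2.lastName = p1.lastName AND
                            p2.id <> p1.id)
        ORDER BY p1.id
        """
    )
]

# Grouped form, for comparison.
grouped_ids = [
    row[0]
    for row in conn.execute(
        """
        SELECT id FROM Person
        WHERE (firstName, lastName) IN (
            SELECT firstName, lastName
            FROM Person
            GROUP BY firstName, lastName
            HAVING COUNT(*) > 1
        )
        ORDER BY id
        """
    )
]

print(exists_ids)   # [1, 3]
print(grouped_ids)  # [1, 3]
```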

Regarding your error, I can't help but wonder whether the problem is that you did not use aliases in the subquery, leading MySQL to think that you are referring to the columns in the outer query. Try this version:

SELECT p1.*
FROM Person p1
WHERE CONCAT(p1.firstName, p1.lastName) IN (
    SELECT CONCAT(p2.firstName, p2.lastName)
    FROM Person p2
    GROUP BY CONCAT(p2.firstName, p2.lastName)
    HAVING COUNT(*) > 1
);
