MySQL GROUP BY functional dependence in subquery

Question

I'm writing a query to find duplicate rows in a table of people (including each duplicate):

SELECT *
FROM Person
WHERE CONCAT(firstName,lastName) IN (
    SELECT CONCAT(firstName,lastName) AS name
    FROM Person
    GROUP BY CONCAT(firstName,lastName)
    HAVING COUNT(*) > 1
)

When running this in MySQL 8.0.19 with ONLY_FULL_GROUP_BY enabled, it fails with the following error:

Query 1 ERROR: Expression #1 of HAVING clause is not in GROUP BY clause and contains nonaggregated column 'Person.firstName' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by

I can't figure out how to fix this. I tried changing COUNT(*) to COUNT(CONCAT(firstName,lastName)) but that didn't help.

What's odd is that a) it runs fine in MariaDB 10.2, with or without ONLY_FULL_GROUP_BY, and b) running the subquery by itself causes no issue.

What am I doing wrong? It almost seems like a bug in MySQL.

[edit]: I certainly appreciate alternative solutions to my query; however, I'm really interested in an answer as to why my error is occurring.

Answer 1

Try the version below; it does the same thing as your query:

SELECT *
FROM Person
WHERE (firstName,lastName) IN (
    SELECT firstName,lastName
    FROM Person
    GROUP BY firstName,lastName
    HAVING COUNT(*) > 1
)

Answer 2

Do not merge fields:

SELECT *
FROM Person
WHERE (firstName,lastName) IN (
    SELECT firstName,lastName
    FROM Person
    GROUP BY firstName,lastName
    HAVING COUNT(*) > 1
)

Or use ANY_VALUE() function:

SELECT *
FROM Person
WHERE CONCAT(firstName,lastName) IN (
    SELECT ANY_VALUE(CONCAT(firstName,lastName)) AS name
    FROM Person
    GROUP BY CONCAT(firstName,lastName)
    HAVING COUNT(*) > 1
)
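
One more reason not to merge the fields: concatenation can make two different name pairs look identical (and CONCAT() returns NULL as soon as either argument is NULL). A small illustration with made-up literals, not taken from the question's data:

```sql
-- Made-up values: two different (first, last) pairs that concatenate to the same string.
SELECT
    CONCAT('ab', 'c') = CONCAT('a', 'bc') AS concat_match,  -- 1: both sides are 'abc'
    ('ab', 'c') = ('a', 'bc')             AS tuple_match;   -- 0: columns are compared pairwise
```

Comparing (firstName, lastName) as a row avoids that kind of false duplicate.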

Answer 3

I would write your query with exists logic:

SELECT p1.*
FROM Person p1
WHERE EXISTS (SELECT 1 FROM Person p2
              WHERE p2.firstName = p1.firstName AND
                    p2.lastName = p1.lastName AND
                    p2.id <> p1.id);

This effectively says to select every person for whom we can find another, different person (going by the primary key id column, or whatever the PK might be) with the same first and last name.
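
For anyone who wants to try this, here is a small throwaway table. The schema and rows are my own assumption (the question doesn't show the real table); all that matters is an id primary key plus the two name columns.

```sql
-- Hypothetical schema and data, for illustration only.
CREATE TABLE Person (
    id        INT PRIMARY KEY,
    firstName VARCHAR(50) NOT NULL,
    lastName  VARCHAR(50) NOT NULL
);

INSERT INTO Person (id, firstName, lastName) VALUES
    (1, 'Ann', 'Lee'),
    (2, 'Ann', 'Lee'),   -- duplicate of id 1
    (3, 'Bob', 'Ray');   -- unique name
```

Against these rows the EXISTS query above should return ids 1 and 2 and skip id 3, since only Ann Lee has a second row with a different id.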

The following index may speed up the above query:

CREATE INDEX idx ON Person (lastName, firstName);

This should let the EXISTS lookup evaluate quickly. Note that InnoDB secondary indexes also store the primary key, so if id is the primary key it is effectively appended to the two-column index above, and the lookup can be satisfied from the index alone.
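
To see whether the optimizer actually picks the index, one option is to prefix the query with EXPLAIN. This is only a sketch; the plan depends on table size and statistics, so I won't guess at the output:

```sql
-- idx is the index created above; look for it in the `key` column of the EXPLAIN output.
EXPLAIN
SELECT p1.*
FROM Person p1
WHERE EXISTS (SELECT 1 FROM Person p2
              WHERE p2.firstName = p1.firstName AND
                    p2.lastName = p1.lastName AND
                    p2.id <> p1.id);
```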

Regarding your error, I can't help but wonder if perhaps the problem is that you did not use proper aliases in the subquery, leading MySQL to think that you are referring to the columns in the outer query. Try this version:

SELECT p1.*
FROM Person p1
WHERE CONCAT(p1.firstName, p1.lastName) IN (
    SELECT CONCAT(p2.firstName, p2.lastName)
    FROM Person p2
    GROUP BY CONCAT(p2.firstName, p2.lastName)
    HAVING COUNT(*) > 1
);

3 answers

solution1
1 2020-02-28 06:53:14

solution2
1 2020-02-28 06:54:36

solution3
1 2020-02-28 06:54:42
