MySQL count only returning one result unless using group by

Using the SQL query

select u.name,count(u.name) as 'followers'
from user u,follow f
where u.type = 'c' AND f.followee = u.email
group by u.name

gets me the correct value for all users in my database. However, the exact same query without the group by line only gives me the first value. I am learning SQL for the first time and am having a hard time figuring out why this is.

When you use COUNT without GROUP BY, it counts all the records and returns a single row. When you use COUNT with GROUP BY, it groups the users by name and returns the count for each group.
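The difference can be tried out with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL; the table layout (including the `follower` column) and the sample rows are assumptions made up for the illustration, and the query without GROUP BY selects only the count.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (name TEXT, email TEXT, type TEXT);
    CREATE TABLE follow (follower TEXT, followee TEXT);
    INSERT INTO user VALUES ('alice', 'alice@example.com', 'c'),
                            ('bob',   'bob@example.com',   'c');
    INSERT INTO follow VALUES ('x@example.com', 'alice@example.com'),
                              ('y@example.com', 'alice@example.com'),
                              ('x@example.com', 'bob@example.com');
""")

base = "FROM user u, follow f WHERE u.type = 'c' AND f.followee = u.email"

# No GROUP BY: all matching rows form one group, so one row comes back.
total = conn.execute(f"SELECT COUNT(*) {base}").fetchall()

# GROUP BY u.name: one group, and one output row, per distinct name.
per_user = conn.execute(
    f"SELECT u.name, COUNT(*) AS followers {base} "
    "GROUP BY u.name ORDER BY u.name"
).fetchall()

print(total)     # [(3,)]
print(per_user)  # [('alice', 2), ('bob', 1)]
```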

"the exact same query without the group by line only gives me the first value."

Not quite.

The query without GROUP BY looks like this:

select u.name, count(u.name) as 'followers'
from user u, follow f
where u.type = 'c' AND f.followee = u.email

The query uses COUNT(), which is a GROUP BY (aggregate) function. Such functions operate on the groups of rows produced by the GROUP BY clause. The SQL standard is tolerant, however: when the clause is missing, it still accepts the query and creates a single group from all the rows filtered by the WHERE clause. This is why you get only one row.

On the other hand, your query without the GROUP BY clause is invalid, because it also selects u.name, which is neither aggregated nor grouped.

This is how GROUP BY queries work:

  1. the rows filtered by the WHERE clause are grouped; all the rows in a group have the same value for the first expression present in the GROUP BY clause;
  2. if the GROUP BY clause contains two or more expressions, each group created in the first step is split into sub-groups using the second expression from the GROUP BY clause;
  3. step 2 is repeated for each subsequent expression from the GROUP BY clause, creating nested sub-groups;
  4. a single row is computed from each group created in the previous step; the values of this row are computed using only the values of the rows contained in the group.
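The steps above can be observed with a small sketch that groups first by one expression and then by two. It runs on Python's built-in sqlite3 module rather than MySQL, and the sample rows (including a second user type 'p') are invented for the illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (name TEXT, email TEXT, type TEXT);
    INSERT INTO user VALUES ('alice', 'a1@example.com', 'c'),
                            ('alice', 'a2@example.com', 'c'),
                            ('bob',   'b1@example.com', 'c'),
                            ('carol', 'c1@example.com', 'p');
""")

# One expression: one group (and one output row) per distinct type.
by_type = conn.execute(
    "SELECT type, COUNT(*) FROM user GROUP BY type ORDER BY type"
).fetchall()

# Two expressions: each type group is split into sub-groups by name,
# and one row is computed from each sub-group.
by_type_name = conn.execute(
    "SELECT type, name, COUNT(*) FROM user "
    "GROUP BY type, name ORDER BY type, name"
).fetchall()

print(by_type)       # [('c', 3), ('p', 1)]
print(by_type_name)  # [('c', 'alice', 2), ('c', 'bob', 1), ('p', 'carol', 1)]
```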

If a column or an expression from the SELECT clause does not use a GROUP BY aggregate function and is not present in the GROUP BY clause, then some groups may contain rows having different values for that column or expression; this is an error.

To prevent this, the SQL standard allows in the SELECT clause only expressions that satisfy one of these conditions:

  1. the expression also appears in the GROUP BY clause;
  2. the expression is computed using a GROUP BY aggregate function;
  3. all the columns used by the expression are functionally dependent on the columns that appear in the GROUP BY clause.

Let's analyze the expressions in the SELECT clause of your query:

  • u.name - in the initial query it satisfies condition #1; in the query without GROUP BY it doesn't satisfy any condition. This makes the query invalid SQL.
  • count(u.name) - it satisfies condition #2 in both versions of the query; it causes no problems.
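As a rough illustration of the conditions, here are two ways to rewrite the query without GROUP BY so that every selected expression satisfies condition #2. The sketch uses Python's built-in sqlite3 module as a stand-in for MySQL, and the tables and rows are assumptions. MIN(u.name) is only there to show the mechanics; the alphabetically smallest name is rarely what one actually wants.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (name TEXT, email TEXT, type TEXT);
    CREATE TABLE follow (follower TEXT, followee TEXT);
    INSERT INTO user VALUES ('alice', 'alice@example.com', 'c'),
                            ('bob',   'bob@example.com',   'c');
    INSERT INTO follow VALUES ('x@example.com', 'alice@example.com'),
                              ('y@example.com', 'alice@example.com'),
                              ('x@example.com', 'bob@example.com');
""")

base = "FROM user u, follow f WHERE u.type = 'c' AND f.followee = u.email"

# Rewrite A: drop u.name, so the only selected expression is an aggregate.
only_count = conn.execute(
    f"SELECT COUNT(u.name) AS followers {base}"
).fetchone()

# Rewrite B: wrap u.name in an aggregate function as well.
min_name = conn.execute(
    f"SELECT MIN(u.name), COUNT(u.name) AS followers {base}"
).fetchone()

print(only_count)  # (3,)
print(min_name)    # ('alice', 3)
```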

Although the version of the query without GROUP BY is not valid SQL, MySQL versions before 5.7.5 allow it by default (later versions reject it unless the ONLY_FULL_GROUP_BY SQL mode is disabled). When MySQL does accept it, it reserves the right to return indeterminate values for the invalid expressions (u.name).

A quote from the documentation:

"In this case, the server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate, which is probably not what you want. Furthermore, the selection of values from each group cannot be influenced by adding an ORDER BY clause."

In plain English, this means that your query without GROUP BY returns the correct value for followers (the total across all matching users), but the value returned for name can differ between executions of the same query. You are unlikely to observe this by simply running the query several times in a row, but chances are it will happen after you add or remove rows from the table, or after you back up the table, truncate it and restore it from the backup (or recreate it on a different machine or a different version of MySQL).
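A small sketch makes the point concrete. It relies on Python's built-in sqlite3 module, because SQLite, like MySQL without ONLY_FULL_GROUP_BY, accepts a bare column next to an aggregate; the tables and rows are assumptions. The count is well defined, while the single group holds more than one distinct name, so whichever name comes back is an arbitrary pick.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (name TEXT, email TEXT, type TEXT);
    CREATE TABLE follow (follower TEXT, followee TEXT);
    INSERT INTO user VALUES ('alice', 'alice@example.com', 'c'),
                            ('bob',   'bob@example.com',   'c');
    INSERT INTO follow VALUES ('x@example.com', 'alice@example.com'),
                              ('y@example.com', 'alice@example.com'),
                              ('x@example.com', 'bob@example.com');
""")

base = "FROM user u, follow f WHERE u.type = 'c' AND f.followee = u.email"

# The single group contains 3 rows but 2 different names.
rows_in_group, distinct_names = conn.execute(
    f"SELECT COUNT(u.name), COUNT(DISTINCT u.name) {base}"
).fetchone()

# The query from the question without GROUP BY: the count is reliable,
# the name is whichever row the engine happened to pick.
name, followers = conn.execute(
    f"SELECT u.name, COUNT(u.name) AS followers {base}"
).fetchone()

print(rows_in_group, distinct_names)  # 3 2
print(followers)                      # 3
print(name)                           # one of the names; do not rely on which
```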


 