Oracle SQL - What's wrong with this grouping?

I am trying to grab the row that has the max of some column. Normally I'd use RANK for this and just select rank = 1, but that seems pointless when I know I just need the max of a column. Here is my SQL:

SELECT
  name,
  value,
  MAX(version)
FROM
  my_table t
WHERE
  person_type = "STUDENT"
GROUP by NAME,VALUE
HAVING version = max(version)

This returns the "you've done something wrong involving grouping" error, i.e. "not a GROUP BY expression", when I try to run it. If I add version to the GROUP BY list, the SQL runs, but it obviously returns all rows instead of just the max version of each.

So my question is mostly "Why doesn't this work?" I am selecting the max of version, so I don't see why I need to group by it. I know there are other solutions (partition over, rank ...), but I am more interested in why this in particular is flawed syntactically.

EDIT: More explicit about the use of this HAVING clause.

Let's say there are these two rows in table t:

NAME    VALUE    VERSION
JEREMY  C        1
JEREMY  A        2

What is returned from this query should be:

JEREMY A 2

But if I remove the HAVING clause, I get:

JEREMY A 2
JEREMY C 2

The HAVING clause, in general, can only refer to columns that are produced by the GROUP BY. In fact, you can think of the HAVING clause as a WHERE applied to the result of the GROUP BY.

That is, the query:

select <whatever>
from t
group by <whatever>
having <some condition>

is equivalent to:

select <whatever>
from (select <whatever>
      from t
      group by <whatever>
     ) t
where <some condition>

If you think about it this way, you'll realize that max(version) makes sense because it is an aggregated value. However, a bare "version" does not make sense, since it is neither a calculated (aggregated) value nor a GROUP BY column.
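As a concrete sketch of that equivalence, assuming the question's table t with columns name, value and version (the threshold of 1 and the alias max_version are arbitrary, chosen only for illustration):

```sql
-- HAVING form: filter the groups on an aggregate
select name, value, max(version) as max_version
from t
group by name, value
having max(version) > 1;

-- Derived-table form: the same filter written as a WHERE on the grouped result
select name, value, max_version
from (select name, value, max(version) as max_version
      from t
      group by name, value)
where max_version > 1;
```

In the second form the outer query can only see name, value and max_version. A bare version column simply does not exist there, which mirrors why HAVING version = max(version) is rejected.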

You seem to know how to fix this. One other comment: some databases (notably MySQL) would accept your syntax. They treat "HAVING version = max(version)" as "HAVING any(version) = max(version)".

You're trying to use version in your HAVING clause, but it's not one of the grouped columns.

If all you want is the name, value and max version, you don't need the HAVING clause at all.

SELECT
  name,
  value,
  MAX(version)
FROM
  my_table t
WHERE
  person_type = 'STUDENT'
GROUP by NAME,VALUE

The HAVING clause is for when you want a "WHERE"-style filter after aggregation, like

HAVING max(version) > 5

EDIT:

Based on your sample data, you're grouping by VALUE, but what you really want to do is identify the VALUE that has the MAX(VERSION) for each NAME.

To do this, you need to use a WHERE EXISTS or a self join, like so:

select name, value, version from t 
where exists
(
  select 1 from
  (select name, max(version) version
     from t 
    group by name) s
  where s.name = t.name and s.version = t.version
)
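The self-join alternative mentioned above might look like the following. This is a sketch against the same table t; the alias max_version is only illustrative:

```sql
-- Join each row to the per-name maximum version and keep only the matching rows
select t.name, t.value, t.version
from t
join (select name, max(version) as max_version
      from t
      group by name) s
  on s.name = t.name
 and s.max_version = t.version;
```

For the two sample rows in the question, this should return only JEREMY, A, 2.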

This SQL statement fails because the HAVING clause runs after the GROUP BY: it can only operate on aggregates or on columns that are listed in the GROUP BY clause. If you have only grouped by NAME and VALUE, VERSION alone has no meaning. It can have many possible values for each combination of NAME and VALUE at that point, so it doesn't make sense to compare it to MAX(version) or any other aggregate, which has exactly one value for every NAME and VALUE pair.

Another way of getting what you want:

select *
from (select name
        , value
        , version
        , max(version) over 
            (partition by name) as max_version
    from t)
where version = max_version;

Sample execution:

SQL> create table t (name varchar2(30)
  2      , value varchar2(1)
  3      , version number not null
  4      , constraint t_pk primary key (name, version));

Table created.

SQL> insert into t select 'JEREMY', 'C', 1 from dual
  2  union all select 'JEREMY', 'A', 2 from dual
  3  union all select 'SARAH', 'D', 2 from dual
  4  union all select 'SARAH', 'X', 1 from dual;

4 rows created.

SQL> commit;

Commit complete.

SQL> select name, value, version
  2  from (select name
  3          , value
  4          , version
  5          , max(version) over
  6              (partition by name) as max_version
  7      from t)
  8  where version = max_version;

NAME                           V    VERSION
------------------------------ - ----------
JEREMY                         A          2
SARAH                          D          2
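
For completeness, Oracle's FIRST/LAST aggregate form (KEEP ... DENSE_RANK LAST) can express the same result with a single GROUP BY on name. This is not one of the approaches proposed above, just a sketch against the same sample table t, and it relies on (name, version) being unique, as the primary key in the sample guarantees:

```sql
-- For each name, take the value from the row with the highest version.
-- max(value) only breaks ties, which the primary key on (name, version) rules out here.
select name
     , max(value) keep (dense_rank last order by version) as value
     , max(version) as version
from t
group by name
order by name;
```

With the four sample rows inserted above, this should return JEREMY, A, 2 and SARAH, D, 2.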
