How do I port a query with a GROUP BY clause to PostgreSQL?

I'm porting a simple expense database to Postgres and got stuck on a view using GROUP BY and multiple JOIN clauses. I think Postgres wants me to use all the tables in the GROUP BY clause.

The table definition is at the end. Note that the columns account_id, receiving_account_id and place_id may be NULL, and an operation can have 0 tags.

Original CREATE statement

CREATE VIEW details AS SELECT
    op.id,
    op.name,
    c.name,
    CASE --amountsign
        WHEN op.receiving_account_id IS NOT NULL THEN
            CASE
                WHEN op.account_id IS NULL THEN '+'
                ELSE '='
            END
        ELSE '-' 
    END || ' ' || printf("%.2f", op.amount) || ' zł' AS amount,
    CASE --account
        WHEN op.receiving_account_id IS NOT NULL THEN
            CASE
                WHEN op.account_id IS NULL THEN ac2.name
                ELSE ac.name || ' -> ' || ac2.name
            END
        ELSE ac.name
    END AS account,
    t.name AS type,
    CASE --date
        WHEN op.time IS NOT NULL THEN op.date || ' ' || op.time
        ELSE op.date
    END AS date,
    p.name AS place,
    GROUP_CONCAT(tag.name, ', ') AS tags
FROM operation op
LEFT JOIN category c ON op.category_id = c.id
LEFT JOIN type t ON op.type_id = t.id
LEFT JOIN account ac ON op.account_id = ac.id
LEFT JOIN account ac2 ON op.receiving_account_id = ac2.id
LEFT JOIN place p ON op.place_id = p.id
LEFT JOIN operation_tag ot ON op.id = ot.operation_id
LEFT JOIN tag ON ot.tag_id = tag.id
GROUP BY IFNULL (ot.operation_id, op.id)
ORDER BY date DESC

Current query in Postgres

I made some updates and my current statement is:

BEGIN TRANSACTION;
CREATE VIEW details AS SELECT
    op.id,
    op.name,
    c.name,
    CASE --amountsign
        WHEN op.receiving_account_id IS NOT NULL THEN
            CASE
                WHEN op.account_id IS NULL THEN '+'
                ELSE '='
            END
        ELSE '-' 
    END || ' ' || op.amount || ' zł' AS amount,
    CASE --account
        WHEN op.receiving_account_id IS NOT NULL THEN
            CASE
                WHEN op.account_id IS NULL THEN ac2.name
                ELSE ac.name || ' -> ' || ac2.name
            END
        ELSE ac.name
    END AS account,
    t.name AS type,
    CASE --date
        WHEN op.time IS NOT NULL THEN to_char(op.date, 'DD.MM.YY') || ' ' || op.time
        ELSE to_char(op.date, 'DD.MM.YY')
    END AS date,
    p.name AS place,
    STRING_AGG(tag.name, ', ') AS tags
FROM operation op
LEFT JOIN category c ON op.category_id = c.id
LEFT JOIN type t ON op.type_id = t.id
LEFT JOIN account ac ON op.account_id = ac.id
LEFT JOIN account ac2 ON op.receiving_account_id = ac2.id
LEFT JOIN place p ON op.place_id = p.id
LEFT JOIN operation_tag ot ON op.id = ot.operation_id
LEFT JOIN tag ON ot.tag_id = tag.id
GROUP BY COALESCE (ot.operation_id, op.id)
ORDER BY date DESC;
COMMIT;

Here I get Column 'x' must appear in GROUP BY clause errors, one after another, as I add the listed columns:

GROUP BY COALESCE(ot.operation_id, op.id), op.id, c.name, ac2.name, ac.name, t.name, p.name

When I add the p.name column I get a Column 'p.name' is defined more than once error. How do I fix that?

Table definition

CREATE TABLE operation (
  id integer NOT NULL PRIMARY KEY,
  name character varying(64) NOT NULL,
  category_id integer NOT NULL,
  type_id integer NOT NULL,
  amount numeric(8,2) NOT NULL,
  date date NOT NULL,
  "time" time without time zone NOT NULL,
  place_id integer,
  account_id integer,
  receiving_account_id integer,
  CONSTRAINT categories_transactions FOREIGN KEY (category_id)
      REFERENCES category (id) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE NO ACTION,
  CONSTRAINT transactions_accounts FOREIGN KEY (account_id)
      REFERENCES account (id) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE NO ACTION,
  CONSTRAINT transactions_accounts_second FOREIGN KEY (receiving_account_id)
      REFERENCES account (id) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE NO ACTION,
  CONSTRAINT transactions_places FOREIGN KEY (place_id)
      REFERENCES place (id) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE NO ACTION,
  CONSTRAINT transactions_transaction_types FOREIGN KEY (type_id)
      REFERENCES type (id) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE NO ACTION
);

As @Andomar already explained: most RDBMS require you to group by every column that appears unaggregated anywhere else in the query (in the SELECT list, but also in HAVING, ORDER BY, etc.).

The SQL standard also defines that expressions in the GROUP BY clause shall cover functionally dependent expressions. Postgres implements this (since version 9.1) to the extent that the PK column covers all columns of the same table.

So op.id covers the whole table and this should work for your current query:

GROUP BY op.id, c.name, 5, t.name, p.name

5 is a positional reference to the SELECT list, which is also allowed in Postgres. It's just notational shorthand for repeating the long expression:

CASE
   WHEN op.receiving_account_id IS NOT NULL THEN
      CASE
         WHEN op.account_id IS NULL THEN ac2.name
         ELSE ac.name || ' -> ' || ac2.name
      END
   ELSE ac.name
END
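To see the primary-key rule and the positional reference in isolation, here is a minimal sketch on hypothetical tables (item, holder and item_note are assumptions for illustration, not part of the question's schema). It assumes PostgreSQL 9.1 or later.

```sql
-- Hypothetical tables, for illustration only.
CREATE TEMP TABLE holder (id integer PRIMARY KEY, name text NOT NULL);
CREATE TEMP TABLE item   (id integer PRIMARY KEY, name text NOT NULL,
                          holder_id integer REFERENCES holder (id));
CREATE TEMP TABLE item_note (item_id integer REFERENCES item (id), note text);

SELECT i.id
     , i.name                                  -- covered by the PK column i.id
     , COALESCE(h.name, 'nobody') AS held_by   -- not covered: different table
     , string_agg(n.note, ', ')   AS notes     -- aggregated, no grouping needed
FROM   item i
LEFT   JOIN holder    h ON h.id = i.holder_id
LEFT   JOIN item_note n ON n.item_id = i.id
GROUP  BY i.id, 3;                             -- 3 = third item of the SELECT list
```

Grouping by h.id instead of 3 should work as well, because h.id is the primary key of its own table and therefore covers h.name.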

I derive from your names that you have an n:m relationship between operation and tag, implemented with operation_tag. All other joins don't seem to multiply rows, so it would be more efficient to aggregate tags separately, like @Andomar hinted; just get the logic right.

This should work:

SELECT op.id
     , op.name
     , c.name
     , CASE  -- amountsign
          WHEN op.receiving_account_id IS NOT NULL THEN
             CASE WHEN op.account_id IS NULL THEN '+' ELSE '=' END
          ELSE '-' 
       END || ' ' || op.amount || ' zł' AS amount
     , CASE  -- account
          WHEN op.receiving_account_id IS NOT NULL THEN
             CASE
                WHEN op.account_id IS NULL THEN ac2.name
                ELSE ac.name || ' -> ' || ac2.name
             END
          ELSE ac.name
       END AS account
     , t.name AS type
     , to_char(op.date, 'DD.MM.YY') || ' ' || op.time AS date  -- see below
     , p.name AS place
     , ot.tags
FROM   operation op
LEFT   JOIN category c   ON op.category_id = c.id
LEFT   JOIN type     t   ON op.type_id = t.id
LEFT   JOIN account  ac  ON op.account_id = ac.id
LEFT   JOIN account  ac2 ON op.receiving_account_id = ac2.id
LEFT   JOIN place    p   ON op.place_id = p.id
LEFT   JOIN (
   SELECT operation_id, string_agg(t.name, ', ') AS tags
   FROM   operation_tag ot
   LEFT   JOIN tag t ON t.id = ot.tag_id
   GROUP  BY 1
   ) ot ON op.id = ot.operation_id
ORDER BY op.date DESC, op.time DESC;
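If the query is wrapped in CREATE VIEW again, every output column needs a distinct name. op.name and c.name would both be called name, which Postgres rejects with a "specified more than once" error. A minimal sketch, assuming the alias category (not in the original) is acceptable, and with the column list trimmed for brevity:

```sql
CREATE VIEW details AS
SELECT op.id
     , op.name
     , c.name AS category   -- hypothetical alias; avoids a second column called "name"
     , t.name AS type
     , p.name AS place
     , ot.tags
FROM   operation op
LEFT   JOIN category c ON op.category_id = c.id
LEFT   JOIN type     t ON op.type_id = t.id
LEFT   JOIN place    p ON op.place_id = p.id
LEFT   JOIN (
   -- one row per operation, so the outer query needs no GROUP BY
   SELECT ot.operation_id, string_agg(tag.name, ', ') AS tags
   FROM   operation_tag ot
   JOIN   tag ON tag.id = ot.tag_id
   GROUP  BY ot.operation_id
   ) ot ON op.id = ot.operation_id;
```

The sort order is left out of the view here; it can be applied when selecting from the view.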

Asides

You can replace:

CASE --date
   WHEN op.time IS NOT NULL THEN to_char(op.date, 'DD.MM.YY') || ' ' || op.time
   ELSE to_char(op.date, 'DD.MM.YY')
END AS date

with this shorter equivalent:

concat_ws(' ', to_char(op.date, 'DD.MM.YY'), op.time) AS date

But since both columns are defined NOT NULL, you can further simplify to:

to_char(op.date, 'DD.MM.YY') || ' ' || op.time AS date
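The difference between the two forms only shows up with NULL input. A small sketch with literal values (the date string is made up for illustration):

```sql
SELECT concat_ws(' ', '01.02.24', NULL::time) AS with_concat_ws  -- NULL argument is skipped: '01.02.24'
     , '01.02.24' || ' ' || NULL::time        AS with_pipes;     -- any NULL operand makes the result NULL
```

So the plain || version is only safe as long as the NOT NULL constraints stay in place.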

Careful with your ORDER BY: you have at least one input column also named date. If you use the unqualified name, it will refer to the output column, which is what you want (as clarified in the comment).

However, sorting by the text representation would not sort according to your timeline correctly. Sort by the original values instead, as suggested in my query above.
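A quick way to see the problem, using two made-up dates around a year boundary:

```sql
-- Sorting by the formatted text puts the older date first, because '31...' > '01...'.
SELECT d, to_char(d, 'DD.MM.YY') AS shown
FROM  (VALUES (date '2023-12-31'), (date '2024-01-01')) v(d)
ORDER  BY to_char(d, 'DD.MM.YY') DESC;

-- Sorting by the underlying date value gives the expected newest-first order.
SELECT d, to_char(d, 'DD.MM.YY') AS shown
FROM  (VALUES (date '2023-12-31'), (date '2024-01-01')) v(d)
ORDER  BY d DESC;
```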

Most databases require you to group by every column that appears unaggregated in the select. Unaggregated means not wrapped in an aggregate like min, max or string_agg. So you'd need to group on: op.id, op.name, c.name, op.receiving_account_id, ..., etc.

The reason for this requirement is that the database has to determine a single value for the group. By adding the column to the group by clause, you confirm that every row in the group has the same value. For other columns, you must specify which value to use with an aggregate. The exception is MySQL, which just picks an arbitrary value if you don't make a conscious choice.
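The rule can be tried on a tiny hypothetical table (sale and its columns are assumptions for illustration). The first query is rejected by Postgres; the other two show the two ways to resolve it.

```sql
CREATE TEMP TABLE sale (region text, product text, amount numeric);

-- Rejected: product is neither grouped nor aggregated.
-- SELECT region, product, sum(amount) FROM sale GROUP BY region;

-- Option 1: add the column to GROUP BY (one row per region and product).
SELECT region, product, sum(amount) AS total
FROM   sale
GROUP  BY region, product;

-- Option 2: keep one row per region and pick a value with an aggregate.
SELECT region, min(product) AS first_product, sum(amount) AS total
FROM   sale
GROUP  BY region;
```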

If your group by is just to create a list of tags, you could move that to a subquery:

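left join
        (
        select  ot.operation_id
        ,       string_agg(tag.name, ', ') as tags
        from    operation_tag ot
        join    tag
        on      tag.id = ot.tag_id
        group by
                ot.operation_id   -- one row per operation, not per tag
        ) ot
on      ot.operation_id = op.id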

That way you can avoid a very long group by in the outer query.

