Distinct values from an array?

Given the following tables:

CREATE TEMPORARY TABLE guys ( guy_id integer primary key, guy text );
CREATE TEMPORARY TABLE sales ( log_date date, sales_guys integer[], sales smallint );
INSERT INTO guys VALUES(1,'john'),(2,'joe');
INSERT INTO sales VALUES('2016-01-01', '{1,2}', 2),('2016-01-02','{1,2}',4);

The following query works well to show the names for a given date:

SELECT log_date, sales_guys, ARRAY_AGG(guy), sales 
FROM sales 
JOIN guys ON 
   guys.guy_id = ANY(sales.sales_guys) 
GROUP BY log_date, sales_guys, sales 
ORDER BY log_date ASC;

  log_date  | sales_guys | array_agg  | sales 
------------+------------+------------+-------
 2016-01-01 | {1,2}      | {john,joe} |     2
 2016-01-02 | {1,2}      | {john,joe} |     4

The following query, however, gives me one name per date per guy, so each name appears twice here (and more often with more dates):

SELECT sales_guys, ARRAY_AGG(guy), SUM(sales) AS sales
FROM sales
JOIN guys ON guys.guy_id = ANY(sales.sales_guys)
GROUP BY sales_guys;

It yields:

 sales_guys |      array_agg      | sales 
------------+---------------------+-------
 {1,2}      | {john,joe,john,joe} |    12

Is there a way to reduce the ARRAY_AGG call so that it returns only the unique names?

You can use DISTINCT inside the aggregate:

SELECT sales_guys, ARRAY_AGG(DISTINCT guy), SUM(sales) AS sales
FROM sales
JOIN guys ON guys.guy_id = ANY(sales.sales_guys)
GROUP BY sales_guys;

Without ORDER BY there is no order you can rely on, except that the elements of an array come out in array order when unnested. If your query does more with the result, though, the rows may be re-ordered.

You can simply add ORDER BY to any aggregate function in Postgres:

SELECT s.sales_guys, ARRAY_AGG(DISTINCT g.guy ORDER BY g.guy) AS names, SUM(s.sales) AS sum_sales
FROM   sales s
JOIN   guys  g ON g.guy_id = ANY(s.sales_guys)
GROUP  BY s.sales_guys;

But that's obviously not the original order of array elements. And the query has other issues. For one, the join produces one row per matching array element, so SUM(sales) counts every sale once per guy (12 instead of 6 in the example above). Also, neither IN nor = ANY() cares about the order of elements in the set, list or array on the right side.
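Both points can be checked directly. The following is a minimal sketch that assumes the temporary tables guys and sales from the question; the column aliases in_first, in_second and joined_rows are made up for illustration.

```sql
-- Membership tests ignore the position of an element in the array:
-- both expressions evaluate to true.
SELECT 1 = ANY ('{1,2}'::int[]) AS in_first
     , 1 = ANY ('{2,1}'::int[]) AS in_second;

-- The join emits one row per matching array element, so each sales row
-- appears once per guy before any aggregation happens.
SELECT s.log_date, s.sales, count(*) AS joined_rows
FROM   sales s
JOIN   guys  g ON g.guy_id = ANY (s.sales_guys)
GROUP  BY s.log_date, s.sales
ORDER  BY s.log_date;
```

With the sample data, each date shows joined_rows = 2, which is why summing sales after the join gives 2*2 + 4*2 = 12.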

Proper solution

For this task (attention to the details!):

Get the total sales per array sales_guys, where the order of elements makes a difference (arrays '{1,2}' and '{2,1}' are not the same) and sales_guys has neither duplicate nor NULL elements. Add an array of resolved names in matching order.
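To see why element order matters when grouping, here is a small sketch with inline VALUES instead of the sales table; the three rows are invented for illustration only.

```sql
-- Arrays compare element by element, so '{1,2}' and '{2,1}' are
-- different values and end up in different groups.
SELECT sales_guys, sum(sales) AS total_sales
FROM  (
   VALUES ('{1,2}'::int[], 2)
        , ('{2,1}'::int[], 4)
        , ('{1,2}'::int[], 1)
   ) AS t(sales_guys, sales)
GROUP  BY sales_guys
ORDER  BY sales_guys;
```

This returns two rows: {1,2} with a total of 3 and {2,1} with a total of 4.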

Use unnest() with WITH ORDINALITY, and aggregate the arrays before you resolve names. That's cheaper and less error-prone.
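As a quick illustration of what WITH ORDINALITY adds (available in Postgres 9.4 or later), this sketch unnests a literal array; the alias sg(guy_id, ord) mirrors the one used in the solution.

```sql
-- WITH ORDINALITY appends a bigint column holding the position of
-- each element in the array, starting at 1.
SELECT guy_id, ord
FROM   unnest('{2,1}'::int[]) WITH ORDINALITY AS sg(guy_id, ord);
--  guy_id | ord
-- --------+-----
--       2 |   1
--       1 |   2
```

Ordering by ord later restores the original element order, no matter how the rows were shuffled by joins in between.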

SELECT s.*, g.*
FROM  (
   SELECT sales_guys, sum (sales) AS total_sales                -- aggregate first in subquery
   FROM   sales
   GROUP  BY 1
   ) s
, LATERAL (
   SELECT array_agg(guy ORDER BY ord) AS names                  -- order by original order
   FROM   unnest(s.sales_guys) WITH ORDINALITY sg(guy_id, ord)  -- with order of elements
   LEFT   JOIN guys g USING (guy_id)                            -- LEFT JOIN to add NULL for missing guy_id
   ) g;

The LATERAL subquery can be joined with an unconditional CROSS JOIN (the comma is shorthand notation for it) because the aggregate in the subquery guarantees a result for every row. Otherwise you'd use LEFT JOIN LATERAL .. ON true.
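For comparison, the same query can be written with the explicit LEFT JOIN LATERAL .. ON true form. This is only a sketch of the alternative spelling, again assuming the tables from the question; the outer alias n is chosen here to keep it apart from the inner guys g.

```sql
SELECT s.sales_guys, s.total_sales, n.names
FROM  (
   SELECT sales_guys, sum(sales) AS total_sales   -- aggregate first
   FROM   sales
   GROUP  BY sales_guys
   ) s
LEFT   JOIN LATERAL (
   SELECT array_agg(g.guy ORDER BY sg.ord) AS names   -- original element order
   FROM   unnest(s.sales_guys) WITH ORDINALITY sg(guy_id, ord)
   LEFT   JOIN guys g USING (guy_id)                   -- NULL for a missing guy_id
   ) n ON true;
```

With the sample data this returns a single row: {1,2}, a total of 6, and the names {john,joe}. The LEFT JOIN form only makes a difference when the lateral subquery can return no row, which the aggregate without GROUP BY rules out here.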

