Distinct values from an array?
Given the following tables:
CREATE TEMPORARY TABLE guys ( guy_id integer primary key, guy text );
CREATE TEMPORARY TABLE sales ( log_date date, sales_guys integer[], sales smallint );
INSERT INTO guys VALUES(1,'john'),(2,'joe');
INSERT INTO sales VALUES('2016-01-01', '{1,2}', 2),('2016-01-02','{1,2}',4);
The following query works well to show the names for a given date:
SELECT log_date, sales_guys, ARRAY_AGG(guy), sales
FROM sales
JOIN guys ON
guys.guy_id = ANY(sales.sales_guys)
GROUP BY log_date, sales_guys, sales
ORDER BY log_date ASC;
  log_date  | sales_guys | array_agg  | sales
------------+------------+------------+-------
 2016-01-01 | {1,2}      | {john,joe} |     2
 2016-01-02 | {1,2}      | {john,joe} |     4
The following query, however, gives me one name per date per guy, so each name appears twice here (and more often with more dates):
SELECT sales_guys, ARRAY_AGG(guy), SUM(sales) AS sales
FROM sales
JOIN guys ON guys.guy_id = ANY(sales.sales_guys)
GROUP BY sales_guys;
Yields:
 sales_guys |      array_agg      | sales
------------+---------------------+-------
 {1,2}      | {john,joe,john,joe} |    12
Is there a way to reduce the ARRAY_AGG call so that it gives only the unique names?
You can use DISTINCT inside the aggregate:
SELECT sales_guys, ARRAY_AGG(DISTINCT guy), SUM(sales) AS sales
FROM sales
JOIN guys ON guys.guy_id = ANY(sales.sales_guys)
GROUP BY sales_guys;
Without ORDER BY, there is no order you can trust, except that the elements of an array, when unnested, come out in array order. If your query does more with the result, though, it may be re-ordered.
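As a small illustration of array order, the sketch below unnests a literal array and uses WITH ORDINALITY (available in PostgreSQL 9.4 or later) to expose each element's position. The aliases t, guy_id and ord are only illustrative names.

```sql
-- Unnest a literal array and keep each element's position in the array.
SELECT t.guy_id, t.ord
FROM   unnest('{2,1}'::int[]) WITH ORDINALITY AS t(guy_id, ord)
ORDER  BY t.ord;   -- explicit ORDER BY on the position, so the order is guaranteed
-- guy_id | ord
--      2 |   1
--      1 |   2
```

Sorting on the ordinality column makes the array order explicit, so it no longer depends on how the rest of the query is executed.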
You can simply add ORDER BY to any aggregate function in Postgres:
SELECT s.sales_guys
     , ARRAY_AGG(DISTINCT g.guy ORDER BY g.guy) AS names
     , SUM(s.sales) AS sum_sales
FROM sales s
JOIN guys g ON g.guy_id = ANY(s.sales_guys)
GROUP BY s.sales_guys;
But that's obviously not the original order of array elements. And the query has other issues. Neither IN nor = ANY() cares about the order of elements in the set, list or array on the right side.
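Both points can be checked against the temporary tables from the question. The sketch below is my reading of the "other issues", not a quote from a reference: = ANY() only tests membership, and the join yields one row per (sale, guy) pair, which inflates SUM(sales).

```sql
-- = ANY() only tests membership: '{1,2}' and '{2,1}' match the same rows,
-- and without ORDER BY the order of the returned rows is not guaranteed.
SELECT guy_id, guy
FROM   guys
WHERE  guy_id = ANY ('{2,1}'::int[]);

-- The join produces one row per (sale, guy) pair,
-- so every sales row is counted once per guy in its array.
SELECT s.log_date, s.sales, g.guy
FROM   sales s
JOIN   guys  g ON g.guy_id = ANY (s.sales_guys)
ORDER  BY s.log_date, g.guy_id;   -- 4 rows for the 2 rows in sales

-- Summing before joining gives the real total.
SELECT sum(sales) AS true_total
FROM   sales;                     -- 6, while the question's grouped query reports 12
```

That is why it pays to aggregate sales first and resolve the names afterwards.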
For this task (attention to the details!):
Get the total sales per array sales_guys, where the order of elements makes a difference (arrays '{1,2}' and '{2,1}' are not the same) and sales_guys has neither duplicate nor NULL elements. Add an array of resolved names in matching order.
Use unnest() with WITH ORDINALITY, and aggregate arrays before you resolve names; that's cheaper and less error-prone.
SELECT s.*, g.names
FROM  (
   SELECT sales_guys, sum(sales) AS total_sales        -- aggregate first in subquery
   FROM   sales
   GROUP  BY 1
   ) s
, LATERAL (
   SELECT array_agg(guy ORDER BY ord) AS names         -- order by original order
   FROM   unnest(s.sales_guys) WITH ORDINALITY sg(guy_id, ord)  -- with order of elements
   LEFT   JOIN guys g USING (guy_id)                   -- LEFT JOIN to add NULL for missing guy_id
   ) g;
The LATERAL subquery can be joined with an unconditional CROSS JOIN (the comma is shorthand notation for it) because the aggregate in the subquery guarantees a result for every row. Otherwise you'd use LEFT JOIN LATERAL .. ON true.
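For contrast, here is a minimal sketch of the LEFT JOIN LATERAL .. ON true form, again assuming the temporary tables from the question. The subquery has no aggregate, so it can return no row at all (for an empty array, say), and a CROSS JOIN would then drop the sales row. The alias f and the column name first_guy are hypothetical, chosen only for this example.

```sql
-- Resolve only the first guy of each array.
-- Without an aggregate the subquery may return zero rows,
-- so LEFT JOIN ... ON true is needed to keep every row of sales.
SELECT s.log_date, s.sales_guys, f.first_guy
FROM   sales s
LEFT   JOIN LATERAL (
   SELECT g.guy AS first_guy
   FROM   unnest(s.sales_guys) WITH ORDINALITY sg(guy_id, ord)
   JOIN   guys g USING (guy_id)
   ORDER  BY sg.ord      -- original array order
   LIMIT  1
   ) f ON true;
```

For rows without a match, first_guy is NULL instead of the row disappearing from the result.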