STRING_AGG in BigQuery
I have a problem with STRING_AGG in BigQuery. I'm trying:
SELECT
  id,
  institution,
  COUNT(DISTINCT institution) OVER (PARTITION BY id) AS count_intitution,
  STRING_AGG(DISTINCT institution, ",") OVER (PARTITION BY id) AS list_intitution
FROM
  name_table
WHERE
  DATE(created_at) = "2020-02-02"
and I get this error:
Analytic function string_agg does not support DISTINCT.
The BigQuery documentation says that STRING_AGG allows the use of DISTINCT:
https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-and-operators#string_agg
But apparently DISTINCT is not supported once STRING_AGG is combined with OVER (PARTITION BY ...). Why?
EDIT:
The current table looks like this (it is an example; the real table has more attributes):
|id |institution|
|1 | a |
|1 | b |
|2 | a |
|2 | c |
|3 | a |
|1 | a |
and what I want to achieve is:
|id|count_institution|list_institution|
|1 |2 |a,b |
|2 |2 |a,c |
|3 |1 |a |
Below is for BigQuery Standard SQL:
#standardSQL
SELECT * REPLACE((
    SELECT STRING_AGG(DISTINCT i) FROM t.list_intitution i
  ) AS list_intitution)
FROM (
  SELECT
    id,
    institution,
    COUNT(DISTINCT institution) OVER (PARTITION BY id) AS count_intitution,
    ARRAY_AGG(institution) OVER (PARTITION BY id) AS list_intitution
  FROM
    name_table
  WHERE
    DATE(created_at) = "2020-02-02"
) t
Note: compared with your original query, the inner query drops DISTINCT and uses ARRAY_AGG instead of STRING_AGG as the window function; the outer query then processes that array to build the list of distinct values.
Below is an answer to your updated question.
You can simply use GROUP BY, as in the example below:
#standardSQL
SELECT id,
COUNT(DISTINCT institution) AS count_institution,
STRING_AGG(DISTINCT institution) AS list_institution
FROM name_table
GROUP BY id
Applied to the sample data from your question, as in the example below:
#standardSQL
WITH name_table AS (
SELECT 1 id, 'a' institution UNION ALL
SELECT 1, 'b' UNION ALL
SELECT 2, 'a' UNION ALL
SELECT 2, 'c' UNION ALL
SELECT 3, 'a' UNION ALL
SELECT 1, 'a'
)
SELECT id,
COUNT(DISTINCT institution) AS count_institution,
STRING_AGG(DISTINCT institution) AS list_institution
FROM name_table
GROUP BY id
the result is:
|Row|id|count_institution|list_institution|
|1  |1 |2                |a,b             |
|2  |2 |2                |a,c             |
|3  |3 |1                |a               |
You can easily work around this:
SELECT id, institution,
       COUNT(DISTINCT institution) OVER (PARTITION BY id) AS count_intitution,
       -- Only the first row of each (id, institution) pair contributes; STRING_AGG skips NULLs.
       STRING_AGG(CASE WHEN seqnum = 1 THEN institution END, ',') OVER (PARTITION BY id) AS list_intitution
FROM (SELECT t.*,
             ROW_NUMBER() OVER (PARTITION BY id, institution ORDER BY id) AS seqnum
      FROM name_table t
      WHERE DATE(created_at) = '2020-02-02'
     ) t
Updated based on your updated question. You could simply not use window functions:
with cte1 as
(select distinct id, institution
from name_table
where date(created_at) = "2020-02-02")
select id, count(institution) count_inst, string_agg(institution,"," ) list_inst
from cte1
group by id;
Outputs:
+----+------------+-----------+
| id | count_inst | list_inst |
+----+------------+-----------+
| 1 | 2 | a,b |
| 2 | 2 | a,c |
| 3 | 1 | a |
+----+------------+-----------+