SQL: how can I exclude certain lines from an aggregated result?
In the query I built, the result looks like this:
SELECT name
,ARRAY_AGG(fruits ORDER BY time ASC) AS all_fruits
FROM table_fruits
| name | all_fruits |
|---|---|
| Person A | Apple, Banana, Apple, Apple, Apple, Apple |
| Person B | Apple, Apple, Apple, Banana, Apple, Banana |
| Person C | Banana, Banana, Apple, Banana, Apple, Apple |
I want to add one more column that shows the count of apples. However, I do not want to count apples that are immediately followed by a banana. The additional column should therefore look like this:
| name | all_fruits | count_of_apple |
|---|---|---|
| Person A | Apple, Banana, Apple, Apple, Apple, Apple | 4 |
| Person B | Apple, Apple, Apple, Banana, Apple, Banana | 2 |
| Person C | Banana, Banana, Apple, Banana, Apple, Apple | 2 |
How would I do this in SQL? The source includes a time column recording when each fruit was eaten.
You can check:
- what the "fruits" value of the following row is, using the LEAD window function. For the last row of each partition LEAD returns NULL; the COALESCE function replaces this NULL with the current "fruits" value.
- whether the current value is "Apple" and the next value is not "Banana", inside a CASE statement, which then assigns 1 to the new column.
SELECT *,
CASE WHEN fruits = 'Apple'
AND COALESCE(LEAD(fruits) OVER(
PARTITION BY name
ORDER BY time),
fruits) <> 'Banana'
THEN 1
END AS apples_not_after_bananas
FROM table_fruits
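To see what the flag column looks like row by row, here is a minimal sketch that runs the same CASE expression through Python's built-in sqlite3 module. It assumes an SQLite build with window-function support (3.25 or later); the sample rows are Person B's sequence from the question, and the integer `time` values and the extra `next_fruit` column are illustrative additions, not part of the original query.

```python
import sqlite3

# Hypothetical sample data: Person B's sequence from the question, with an
# integer standing in for the real "time" column.
rows = [("Person B", fruit, t) for t, fruit in enumerate(
    ["Apple", "Apple", "Apple", "Banana", "Apple", "Banana"], start=1)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_fruits (name TEXT, fruits TEXT, time INTEGER)")
conn.executemany("INSERT INTO table_fruits VALUES (?, ?, ?)", rows)

# next_fruit is shown only to make the LEAD result visible; the CASE
# expression is the one from the answer.
flags = conn.execute("""
    SELECT fruits,
           LEAD(fruits) OVER (PARTITION BY name ORDER BY time) AS next_fruit,
           CASE WHEN fruits = 'Apple'
                 AND COALESCE(LEAD(fruits) OVER (PARTITION BY name
                                                 ORDER BY time),
                              fruits) <> 'Banana'
                THEN 1
           END AS apples_not_after_bananas
    FROM table_fruits
    ORDER BY time
""").fetchall()

for row in flags:
    print(row)
```

Only the first two apples get a 1; every other row, including the final banana whose `next_fruit` is NULL, gets NULL in the flag column.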
After this step, you can use your own code and add:
- the GROUP BY clause you missed, to aggregate over the "name" field
- the SUM aggregation function over the 1s generated in the previous step, which mark apples that were not followed by bananas.
WITH cte AS (
SELECT *,
CASE WHEN fruits = 'Apple'
AND COALESCE(LEAD(fruits) OVER(
PARTITION BY name
ORDER BY time),
fruits) <> 'Banana'
THEN 1
END AS apples_not_after_bananas
FROM table_fruits
)
SELECT name,
ARRAY_AGG(fruits ORDER BY time ASC) AS all_fruits,
SUM(apples_not_after_bananas) AS count_of_apple
FROM cte
GROUP BY name
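As a quick check of the full query against the expected table from the question, the sketch below loads the three sequences into an in-memory SQLite database and runs the CTE. Two assumptions to note: the integer `time` values are made up to preserve the listed order, and the `ARRAY_AGG` column is left out because SQLite does not provide that function.

```python
import sqlite3

# Sequences copied from the question; position in the list stands in for
# the real "time" column (an assumption for this sketch).
people = {
    "Person A": ["Apple", "Banana", "Apple", "Apple", "Apple", "Apple"],
    "Person B": ["Apple", "Apple", "Apple", "Banana", "Apple", "Banana"],
    "Person C": ["Banana", "Banana", "Apple", "Banana", "Apple", "Apple"],
}
rows = [(name, fruit, t)
        for name, fruits in people.items()
        for t, fruit in enumerate(fruits, start=1)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_fruits (name TEXT, fruits TEXT, time INTEGER)")
conn.executemany("INSERT INTO table_fruits VALUES (?, ?, ?)", rows)

# Same CTE as in the answer, minus the ARRAY_AGG column.
counts = dict(conn.execute("""
    WITH cte AS (
        SELECT *,
               CASE WHEN fruits = 'Apple'
                     AND COALESCE(LEAD(fruits) OVER (PARTITION BY name
                                                     ORDER BY time),
                                  fruits) <> 'Banana'
                    THEN 1
               END AS apples_not_after_bananas
        FROM table_fruits
    )
    SELECT name,
           SUM(apples_not_after_bananas) AS count_of_apple
    FROM cte
    GROUP BY name
""").fetchall())

print(counts)
```

One design point worth knowing: SUM ignores NULLs, so a person with no qualifying apples would get NULL rather than 0 unless the CASE is given an `ELSE 0` branch.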
Edit: the banana came more than 1 day later
If you want to add this specific condition, or any other condition, you need to work inside the CASE statement, which currently has two conditions: one on the current fruit and one on the next fruit.
Checking whether the banana came more than 1 day later just means adding something like this:
CASE WHEN fruits = 'Apple'
AND COALESCE(LEAD(fruits) OVER(
PARTITION BY name
ORDER BY time),
fruits) <> 'Banana'
--AND <difference between the next and the current time value is greater than 1 day>
THEN 1
END AS apples_not_after_bananas
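The placeholder leaves the exact logic open, so the following is only one possible reading, sketched in SQLite through Python's sqlite3 module: an apple is excluded only when a banana follows within one day, and still counts when the banana arrives more than one day later. Under that assumption the time test is combined with the next-fruit test using OR. The `Person D` rows, the timestamps, and the `nxt` CTE name are hypothetical, and `julianday` is SQLite-specific; other dialects have their own date arithmetic.

```python
import sqlite3

# Hypothetical rows: ISO-8601 text timestamps sort correctly as strings.
rows = [
    ("Person D", "Apple",  "2024-01-01 08:00:00"),  # banana 4 hours later
    ("Person D", "Banana", "2024-01-01 12:00:00"),
    ("Person D", "Apple",  "2024-01-02 08:00:00"),  # banana 3 days later
    ("Person D", "Banana", "2024-01-05 08:00:00"),
    ("Person D", "Apple",  "2024-01-06 08:00:00"),  # last row, no next fruit
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_fruits (name TEXT, fruits TEXT, time TEXT)")
conn.executemany("INSERT INTO table_fruits VALUES (?, ?, ?)", rows)

# Assumed rule: count an apple unless a banana follows within one day.
counts = dict(conn.execute("""
    WITH nxt AS (
        SELECT *,
               LEAD(fruits) OVER (PARTITION BY name ORDER BY time) AS next_fruit,
               LEAD(time)   OVER (PARTITION BY name ORDER BY time) AS next_time
        FROM table_fruits
    )
    SELECT name,
           SUM(CASE WHEN fruits = 'Apple'
                     AND (COALESCE(next_fruit, fruits) <> 'Banana'
                          -- julianday differences are measured in days
                          OR julianday(next_time) - julianday(time) > 1)
                    THEN 1
               END) AS count_of_apple
    FROM nxt
    GROUP BY name
""").fetchall())

print(counts)
```

If the intended rule is different (for example, excluding the apple only when the banana comes late), the boolean combination inside the CASE changes accordingly while the two LEAD columns stay the same.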