SQL query to find all combinations of grouped values
I am looking for a SQL query or a series of SQL queries.
I have a table with the columns id, event_type, and timestamp. The event_type values represent a flow of events associated with the same ID. What I want to do is to query the number of distinct combinations of event types (sorted by timestamp). For example, provided this table:
id          event_type   timestamp
-----------------------------------------
foo         event_1      101
foo         event_2      102
bar         event_2      102
bar         event_1      101
foo         event_3      103
bar         event_3      103
blah        event_1      101
bleh        event_2      102
backwards   event_1      103
backwards   event_2      102
backwards   event_3      101
Then I should get the following result:
combination                  count
-------------------------------
[event_1,event_2,event_3]    2    // foo and bar
[event_3,event_2,event_1]    1    // backwards
[event_1]                    1    // blah
[event_2]                    1    // bleh
You can apply two levels of grouping to your data: first build one ordered combination per id, then count how often each combination occurs.
For MySQL, use group_concat():
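-- outer level: count the ids that share each combination
select t.combination, count(*) count
from (
  -- inner level: one row per id, event types joined in timestamp order
  select
    group_concat(event_type order by timestamp) combination
  from tablename
  group by id
) t
group by t.combination
order by count desc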
See the demo.
For PostgreSQL, use array_agg() with array_to_string():
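-- outer level: count the ids that share each combination
select t.combination, count(*) count
from (
  -- inner level: one row per id, event types joined in timestamp order
  select
    array_to_string(array_agg(event_type order by timestamp), ',') combination
  from tablename
  group by id
) t
group by t.combination
order by count desc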
See the demo.
For Oracle, there is listagg():
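-- outer level: count the ids that share each combination
select t.combination, count(*) count
from (
  -- inner level: one row per id, event types joined in timestamp order
  select
    listagg(event_type, ',') within group (order by timestamp) combination
  from tablename
  group by id
) t
group by t.combination
order by count desc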
See the demo.
For SQL Server 2017+, there is string_agg():
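-- outer level: count the ids that share each combination
select t.combination, count(*) count
from (
  -- inner level: one row per id, event types joined in timestamp order
  select
    string_agg(event_type, ',') within group (order by timestamp) combination
  from tablename
  group by id
) t
group by t.combination
order by count desc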
See the demo.
Results:
| combination | count |
| ----------------------- | ----- |
| event_1,event_2,event_3 | 2 |
| event_3,event_2,event_1 | 1 |
| event_1 | 1 |
| event_2 | 1 |
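For readers who want to check the logic outside any particular database, the two levels of grouping can be sketched in plain Python. This is only an illustration under stated assumptions, not part of the answer's SQL: the names `ROWS` and `count_combinations` are hypothetical, and the rows are the sample data from the question.

```python
from collections import Counter, defaultdict

# Sample rows from the question: (id, event_type, timestamp).
ROWS = [
    ("foo", "event_1", 101),
    ("foo", "event_2", 102),
    ("bar", "event_2", 102),
    ("bar", "event_1", 101),
    ("foo", "event_3", 103),
    ("bar", "event_3", 103),
    ("blah", "event_1", 101),
    ("bleh", "event_2", 102),
    ("backwards", "event_1", 103),
    ("backwards", "event_2", 102),
    ("backwards", "event_3", 101),
]


def count_combinations(rows):
    """Count how many ids share each timestamp-ordered event sequence."""
    # Level 1: collect each id's event types in timestamp order
    # (the role of the inner query with its ordered string aggregate).
    events_by_id = defaultdict(list)
    for row_id, event_type, _timestamp in sorted(rows, key=lambda r: r[2]):
        events_by_id[row_id].append(event_type)

    # Level 2: count identical sequences (the outer GROUP BY on combination).
    return Counter(",".join(events) for events in events_by_id.values())


combination_counts = count_combinations(ROWS)
for combination, count in combination_counts.most_common():
    print(combination, count)
```

The counts match the results table above: foo and bar share one sequence, while backwards, blah, and bleh each have their own.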
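SELECT
  combi.combination,
  COUNT(*) AS count
FROM
  (
    -- Identifiers are unquoted: in MySQL's default SQL mode, double quotes
    -- delimit string literals, not identifiers.
    SELECT
      GROUP_CONCAT(event_type SEPARATOR ',') AS combination
    FROM
      ?table?  -- placeholder: substitute your table name
    GROUP BY
      id
  ) AS combi
GROUP BY
  combi.combination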
Note: the GROUP_CONCAT(... SEPARATOR ...) syntax is not standard SQL; it is DB-specific (in this case MySQL; other databases have other aggregate functions). You might need to adjust for your DB of choice or specify in tags which DB you are actually using.
As for "sorted by timestamp", you need to define what this actually means. What is "sorted by timestamp" for a group of groups? Without an explicit ordering inside the aggregate, the order in which the event types are concatenated is not guaranteed.
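To make the ordering question concrete, here is a small Python sketch contrasting two possible readings of "combination". It is an illustration only: the helper names are hypothetical, the rows are a subset of the question's sample data, and neither reading is claimed to be the one the asker intended.

```python
from collections import Counter, defaultdict

# Subset of the question's sample rows: (id, event_type, timestamp).
ROWS = [
    ("foo", "event_1", 101),
    ("foo", "event_2", 102),
    ("foo", "event_3", 103),
    ("bar", "event_2", 102),
    ("bar", "event_1", 101),
    ("bar", "event_3", 103),
    ("backwards", "event_1", 103),
    ("backwards", "event_2", 102),
    ("backwards", "event_3", 101),
]


def group_events(rows):
    """Collect (timestamp, event_type) pairs per id, in row order."""
    grouped = defaultdict(list)
    for row_id, event_type, timestamp in rows:
        grouped[row_id].append((timestamp, event_type))
    return grouped


def count_ordered(rows):
    """Reading 1: the timestamp order of the events is part of the combination."""
    return Counter(
        tuple(event for _timestamp, event in sorted(pairs))
        for pairs in group_events(rows).values()
    )


def count_unordered(rows):
    """Reading 2: only the set of event types matters, not their order."""
    return Counter(
        frozenset(event for _timestamp, event in pairs)
        for pairs in group_events(rows).values()
    )


ordered = count_ordered(ROWS)
unordered = count_unordered(ROWS)
print(ordered)
print(unordered)
```

Under the first reading, backwards forms its own combination; under the second, it falls into the same group as foo and bar. The query has to encode whichever reading is wanted.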