SQL window function that groups values within a selected distance

Is there a function in PostgreSQL that groups rows with similar values? Ideally it would be a window function like ST_clusterDBSCAN, which puts together rows within a selected distance. Here is an example:

Group   Value    
A       1    
A       2     
A       2     
A       5    
A       6     
A       10
B       1
B       3

I am looking for a function that I could call like this:

SELECT group, value, 
       "FUNCTION"(value, 2) OVER (PARTITION BY group) cluster 
FROM mytable

where the second argument (2) is the maximum range between values that can be in one cluster. The result would look like this:

Group   Value   Cluster   
A       1       1 
A       2       1
A       2       1
A       5       2
A       6       2
A       10      3
B       1       1
B       3       1

Try this; it is the approach dnoeth suggested. I'm calling your test data set "temp", and I renamed the group column "agroup". You can change the threshold (the difference you're looking for) by changing the right side of the inequality, and you may want to change the sorting in your real data. BTW, RANGE UNBOUNDED PRECEDING is the default frame for any window, so it is not really necessary, but I left it in for clarity.

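WITH step1 AS (
  -- Flag each row whose gap to the previous value in its group exceeds the threshold (2).
  SELECT t.*,
         CASE WHEN (value - lag(value, 1) OVER w) > 2 THEN 1 ELSE 0 END AS aflag
  FROM temp t
  WINDOW w AS (PARTITION BY agroup ORDER BY value)
)
-- The running sum of the flags numbers the clusters within each group.
SELECT s.agroup, s.value, sum(aflag) OVER w2 + 1 AS cluster
FROM step1 s WINDOW w2 AS (PARTITION BY agroup ORDER BY value RANGE UNBOUNDED PRECEDING);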
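
To check the logic outside the database, the same idea (flag a row when the gap to the previous value in its group is too large, then keep a running count of the flags) can be sketched in plain Python. The function name `cluster_by_gap` and the list-of-tuples row layout are assumptions made for this illustration and are not part of the answer above. Note that this compares each value only with its immediate predecessor, so a long chain of small gaps stays in one cluster.

```python
from itertools import groupby
from operator import itemgetter


def cluster_by_gap(rows, threshold):
    """Number clusters per group (hypothetical helper for illustration).

    A new cluster starts when the gap to the previous value in the same
    group is greater than `threshold`.
    """
    result = []
    # Sorting by (group, value) plays the role of PARTITION BY group ORDER BY value.
    ordered = sorted(rows, key=itemgetter(0, 1))
    for group, members in groupby(ordered, key=itemgetter(0)):
        cluster = 1
        previous = None  # like lag(value), which is NULL on the first row of a partition
        for _, value in members:
            if previous is not None and value - previous > threshold:
                cluster += 1  # the flag would be 1 here, so the running sum grows
            result.append((group, value, cluster))
            previous = value
    return result


rows = [("A", 1), ("A", 2), ("A", 2), ("A", 5), ("A", 6), ("A", 10),
        ("B", 1), ("B", 3)]
for row in cluster_by_gap(rows, 2):
    print(row)
```

With a threshold of 2 this sketch gives the cluster numbers shown in the question's expected result; with a threshold of 1 the row (B, 3) would start a second cluster in group B.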
