简体   繁体   English

组关闭数字

[英]Group close numbers

I have a table with 2 columns of integers. 我有一个包含2列整数的表。 The first column represents start index and the second column represents end index. 第一列表示起始索引,第二列表示结束索引。

START END
1     8
9     13
14    20
20    25
30    42
42    49
60    67

Simple So far. 简单到目前为止。 What I would like to do is group all the records that follow together: 我想要做的是将所有记录组合在一起:

START END
1     25
30    49
60    67

A record can follow by Starting on the same index as the previous end index or by a margin of 1: 记录可以跟在上一个结束索引的相同索引处开始,或者以1的边距开始:

START END
1     10
10    20

And

START END
1     10
11    20

will both result in 将导致

START END
1     20

I'm using SQL Server 2008 R2. 我正在使用SQL Server 2008 R2。

Any help would be Great 任何帮助都会很棒

Edited to include another version which i think is a bit more reliable, and also works with overlapping ranges 编辑包括另一个我觉得更可靠的版本,也适用于重叠范围

CREATE TABLE #data (start_range INT, end_range INT)
INSERT INTO #data VALUES (1,8) 
INSERT INTO #data VALUES (2,15) 
INSERT INTO #data VALUES (9,13)
INSERT INTO #data VALUES (14,20) 
INSERT INTO #data VALUES (13,26) 
INSERT INTO #data VALUES (12,21) 
INSERT INTO #data VALUES (9,25) 
INSERT INTO #data VALUES (20,25) 
INSERT INTO #data VALUES (30,42) 
INSERT INTO #data VALUES (42,49) 
INSERT INTO #data VALUES (60,67)   

;with ranges as
(
SELECT start_range as level
,end_range as end_range
,row_number() OVER (PARTITION BY (SELECT NULL) ORDER BY start_range) as row
FROM #data
UNION ALL
SELECT
level + 1 as level
,end_range as end_range
,row
From ranges 
WHERE level < end_range
)
,ranges2 AS
(
SELECT DISTINCT 
level
FROM ranges
)
,ranges3 AS
(
SELECT 
level
,row_number() OVER (ORDER BY level) - level as grouping_group
from ranges2
)
SELECT 
MIN(level) as start_number
,MAX(level) as end_number
FROM ranges3
GROUP BY grouping_group
ORDER BY start_number ASC

I think this should work - might not be especially efficient on larger sets though... 我认为这应该有效 - 虽然可能在大型套装上效率不高......

CREATE TABLE #data (start_range INT, end_range INT)
INSERT INTO #data VALUES (1,8)
INSERT INTO #data VALUES (2,15)
INSERT INTO #data VALUES (9,13)
INSERT INTO #data VALUES (14,20)
INSERT INTO #data VALUES (21,25)
INSERT INTO #data VALUES (30,42)
INSERT INTO #data VALUES (42,49)
INSERT INTO #data VALUES (60,67)


;with overlaps as
(
select * 
,end_range - start_range as range
,row_number() OVER (PARTITION BY (SELECT NULL) ORDER BY start_range ASC) as line_number
from #data
)
,overlaps2 AS
(
SELECT
O1.start_range
,O1.end_range
,O1.line_number
,O1.range
,O2.start_range as next_range
,CASE WHEN O2.start_range - O1.end_range < 2 THEN 1 ELSE 0 END as overlap
,O1.line_number - DENSE_RANK() OVER (PARTITION BY (CASE WHEN O2.start_range - O1.end_range < 2 THEN 1 ELSE 0 END) ORDER BY O1.line_number ASC) as overlap_group
FROM overlaps O1
LEFT OUTER JOIN overlaps O2 on O2.line_number = O1.line_number + 1
)
SELECT 
MIN(start_range) as range_start
,MAX(end_range) as range_end
,MAX(end_range) - MIN(start_range) as range_span
FROM overlaps2
GROUP BY overlap_group

This works for your example, let me know if it doesn't work for other data 这适用于您的示例,如果它不适用于其他数据,请告诉我

create table #Range 
(
  [Start] INT,
  [End] INT
)

insert into #Range ([Start], [End]) Values (1, 8)
insert into #Range ([Start], [End]) Values (9, 13)
insert into #Range ([Start], [End]) Values (14, 20)
insert into #Range ([Start], [End]) Values (20, 25)
insert into #Range ([Start], [End]) Values (30, 42)
insert into #Range ([Start], [End]) Values (42, 49)
insert into #Range ([Start], [End]) Values (60, 67)



;with RangeTable as
(select
    t1.[Start],
    t1.[End],
    row_number() over (order by t1.[Start]) as [Index]
from
    #Range t1
where t1.Start not in (select 
                      [End] 
               from
                  #Range
                  Union
               select 
                  [End] + 1
               from
                  #Range
               )
)
select 
    t1.[Start],
    case 
   when t2.[Start] is null then
        (select max([End])
                     from #Range)
       else
        (select max([End])
                     from #Range
                     where t2.[Start] > [End])
end as [End]    
from 
    RangeTable t1
left join 
    RangeTable t2
on
    t1.[Index] = t2.[Index]-1 

drop table #Range;

You could use a number table to solve this problem. 您可以使用数字表来解决此问题。 Basically, you first expand the ranges, then combine subsequent items in groups. 基本上,您首先扩展范围,然后组合后续项目。

Here's one implementation: 这是一个实现:

WITH data (START, [END]) AS (
  SELECT  1,  8 UNION ALL
  SELECT  9, 13 UNION ALL
  SELECT 14, 20 UNION ALL
  SELECT 20, 25 UNION ALL
  SELECT 30, 42 UNION ALL
  SELECT 42, 49 UNION ALL
  SELECT 60, 67
),
expanded AS (
  SELECT DISTINCT
    N = d.START + v.number
  FROM data d
    INNER JOIN master..spt_values v ON v.number BETWEEN 0 AND d.[END] - d.START
  WHERE v.type = 'P'
),
marked AS (
  SELECT
    N,
    SeqID = N - ROW_NUMBER() OVER (ORDER BY N)
  FROM expanded
)
SELECT
  START = MIN(N),
  [END] = MAX(N)
FROM marked
GROUP BY SeqID

This solution uses master..spt_values as a number table, for expanding the initial ranges. 此解决方案使用master..spt_values作为数字表,用于扩展初始范围。 But if (all or some of) those ranges may span more than 2048 (subsequent) values, then you should define and use your own number table. 但如果(全部或部分)这些范围可能超过2048(后续)值,那么您应该定义并使用自己的数字表。

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM