Find the start and end date (set based) in T-SQL
I have the data below.
Name Date
A 2011-01-01 01:00:00.000
A 2011-02-01 02:00:00.000
A 2011-03-01 03:00:00.000
B 2011-04-01 04:00:00.000
A 2011-05-01 07:00:00.000
The desired output is:
Name StartDate EndDate
-------------------------------------------------------------------
A 2011-01-01 01:00:00.000 2011-04-01 04:00:00.000
B 2011-04-01 04:00:00.000 2011-05-01 07:00:00.000
A 2011-05-01 07:00:00.000 NULL
How can I achieve this in T-SQL using a set-based approach?
DECLARE @t TABLE(PersonName VARCHAR(32), [Date] DATETIME)
INSERT INTO @t VALUES('A', '2011-01-01 01:00:00')
INSERT INTO @t VALUES('A', '2011-01-02 02:00:00')
INSERT INTO @t VALUES('A', '2011-01-03 03:00:00')
INSERT INTO @t VALUES('B', '2011-01-04 04:00:00')
INSERT INTO @t VALUES('A', '2011-01-05 07:00:00')
Select * from @t
;WITH cte1
     AS (SELECT *,
                ROW_NUMBER() OVER (ORDER BY Date) -
                ROW_NUMBER() OVER (PARTITION BY PersonName
                                   ORDER BY Date) AS G
         FROM   @t),
     cte2
     AS (SELECT PersonName,
                MIN([Date]) StartDate,
                ROW_NUMBER() OVER (ORDER BY MIN([Date])) AS rn
         FROM   cte1
         GROUP  BY PersonName,
                   G)
SELECT a.PersonName,
       a.StartDate,
       b.StartDate AS EndDate
FROM   cte2 a
       LEFT JOIN cte2 b
         ON a.rn + 1 = b.rn
Because the results of CTEs are not generally materialized, however, you may well find you get better performance if you materialize the intermediate result yourself, as below.
DECLARE @t2 TABLE (
  rn         INT IDENTITY(1, 1) PRIMARY KEY,
  PersonName VARCHAR(32),
  StartDate  DATETIME );
INSERT INTO @t2
SELECT PersonName,
       MIN([Date]) StartDate
FROM   (SELECT *,
               ROW_NUMBER() OVER (ORDER BY Date) -
               ROW_NUMBER() OVER (PARTITION BY PersonName
                                  ORDER BY Date) AS G
        FROM   @t) t
GROUP  BY PersonName,
          G
ORDER  BY StartDate
SELECT a.PersonName,
       a.StartDate,
       b.StartDate AS EndDate
FROM   @t2 a
       LEFT JOIN @t2 b
         ON a.rn + 1 = b.rn
The other answer with the CTE is a good one. Another option would be to iterate over the collection. It's not set based, but it is another way to do it.
You will need to iterate either A. to assign a unique id to each record that corresponds to its transaction, or B. to actually produce your output.
T-SQL is not ideal for iterating over records, especially if you have a lot of them, so I would recommend doing it some other way, such as a small .NET program or something else that is better at iterating.
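For comparison, here is a minimal sketch of what option B could look like if you did iterate in T-SQL with a cursor. It is only an illustration under stated assumptions: the `@runs` work table, the cursor name `c`, and the variable names are hypothetical, `PersonName` is assumed to be non-NULL, and the inline variable initialisation and multi-row `VALUES` assume SQL Server 2008 or later. Run it as its own batch, since it declares its own copy of `@t`.

```sql
-- Sample rows from the question's setup script
DECLARE @t TABLE(PersonName VARCHAR(32), [Date] DATETIME);
INSERT INTO @t VALUES
    ('A', '2011-01-01 01:00:00'),
    ('A', '2011-01-02 02:00:00'),
    ('A', '2011-01-03 03:00:00'),
    ('B', '2011-01-04 04:00:00'),
    ('A', '2011-01-05 07:00:00');

-- Hypothetical work table: one row per consecutive run of the same PersonName
DECLARE @runs TABLE(
    RunId      INT PRIMARY KEY,
    PersonName VARCHAR(32),
    StartDate  DATETIME,
    EndDate    DATETIME NULL);

DECLARE @name     VARCHAR(32),
        @date     DATETIME,
        @prevName VARCHAR(32) = NULL,
        @runId    INT = 0;

DECLARE c CURSOR LOCAL FAST_FORWARD FOR
    SELECT PersonName, [Date] FROM @t ORDER BY [Date];

OPEN c;
FETCH NEXT FROM c INTO @name, @date;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- A change of name closes the previous run and opens a new one
    IF @prevName IS NULL OR @name <> @prevName
    BEGIN
        UPDATE @runs SET EndDate = @date WHERE RunId = @runId;

        SET @runId = @runId + 1;
        INSERT INTO @runs (RunId, PersonName, StartDate)
        VALUES (@runId, @name, @date);

        SET @prevName = @name;
    END

    FETCH NEXT FROM c INTO @name, @date;
END

CLOSE c;
DEALLOCATE c;

-- The last run keeps EndDate = NULL because nothing follows it
SELECT PersonName, StartDate, EndDate
FROM   @runs
ORDER  BY RunId;
```

The loop touches each row once in date order, which is easy to reason about but will usually be slower than the set-based queries on large tables.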
Get a row number so you will know where the previous record is. Then take a record and the next record after it. When the state changes, we have a candidate row.
select
    state,
    min(start_timestamp),
    max(end_timestamp)
from
(
    select
        first.state,
        first.timestamp_ as start_timestamp,
        second.timestamp_ as end_timestamp
    from
    (
        select
            *, row_number() over (order by timestamp_) as id
        from test
    ) as first
    left outer join
    (
        select
            *, row_number() over (order by timestamp_) as id
        from test
    ) as second
    on
        first.id = second.id - 1
        and first.state != second.state
) as agg
group by state
having max(end_timestamp) is not null
union
-- the last row won't have an ending row
--(select state, timestamp_, null from test order by timestamp_ desc limit 1)
-- I think it is something like this for SQL Server
(select top 1 state, timestamp_, null from test order by timestamp_ desc)
order by 2
;
Tested with PostgreSQL, but it should work with SQL Server as well.
SELECT
  PersonName,
  StartDate = MIN(Date),
  EndDate
FROM (
  SELECT
    PersonName,
    Date,
    EndDate = (
      /* get the earliest date after current date
         associated with a different person */
      SELECT MIN(t1.Date)
      FROM @t AS t1
      WHERE t1.Date > t.Date
        AND t1.PersonName <> t.PersonName
    )
  FROM @t AS t
) s
GROUP BY PersonName, EndDate
ORDER BY 2
Basically, for every Date we find the nearest date after it that is associated with a different PersonName. That gives us EndDate, which now distinguishes consecutive groups of dates for the same person. Now we only need to group the data by PersonName and EndDate and take the minimal Date in every group as StartDate. And yes, sort the data by StartDate, of course.
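To see why EndDate works as a grouping key, it can help to run the inner query on its own. The sketch below does that against the sample rows from the question's setup script; it declares its own copy of `@t`, so run it as a separate batch, and the multi-row `VALUES` assumes SQL Server 2008 or later.

```sql
-- Sample rows from the question's setup script
DECLARE @t TABLE(PersonName VARCHAR(32), [Date] DATETIME);
INSERT INTO @t VALUES
    ('A', '2011-01-01 01:00:00'),
    ('A', '2011-01-02 02:00:00'),
    ('A', '2011-01-03 03:00:00'),
    ('B', '2011-01-04 04:00:00'),
    ('A', '2011-01-05 07:00:00');

-- Inner query only: each row paired with the earliest later date
-- that belongs to a different person
SELECT
  PersonName,
  [Date],
  EndDate = (
    SELECT MIN(t1.[Date])
    FROM @t AS t1
    WHERE t1.[Date] > t.[Date]
      AND t1.PersonName <> t.PersonName
  )
FROM @t AS t
ORDER BY [Date];

-- Rows returned for the sample data (time part of each value shown as stored):
-- PersonName  Date                 EndDate
-- A           2011-01-01 01:00:00  2011-01-04 04:00:00
-- A           2011-01-02 02:00:00  2011-01-04 04:00:00
-- A           2011-01-03 03:00:00  2011-01-04 04:00:00
-- B           2011-01-04 04:00:00  2011-01-05 07:00:00
-- A           2011-01-05 07:00:00  NULL
```

The first three A rows share one EndDate and the last A row has a different one (NULL), so grouping by PersonName and EndDate keeps the two runs of A apart.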
There's a very quick way to do this using a bit of Gaps and Islands theory:
WITH CTE as (SELECT PersonName, [Date]
                  , Row_Number() over (ORDER BY [Date])
                  - Row_Number() over (ORDER BY PersonName, [Date]) as Island
             FROM @t)
Select PersonName, Min([Date]), Max([Date])
from CTE
GROUP BY Island, PersonName
ORDER BY Min([Date])
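Note that the query above returns the first and last date inside each island, whereas the question wants EndDate to be the start of the next island. A minimal sketch of one way to get that, assuming SQL Server 2012 or later so that `LEAD` is available, is shown below. The CTE names `Islands` and `Starts` are made up for this illustration, and the batch declares its own copy of `@t`, so run it separately.

```sql
-- Sample rows from the question's setup script
DECLARE @t TABLE(PersonName VARCHAR(32), [Date] DATETIME);
INSERT INTO @t VALUES
    ('A', '2011-01-01 01:00:00'),
    ('A', '2011-01-02 02:00:00'),
    ('A', '2011-01-03 03:00:00'),
    ('B', '2011-01-04 04:00:00'),
    ('A', '2011-01-05 07:00:00');

;WITH Islands AS (
    -- The difference of the two row numbers is constant within a run
    -- of consecutive rows for the same person
    SELECT PersonName, [Date],
           ROW_NUMBER() OVER (ORDER BY [Date])
         - ROW_NUMBER() OVER (PARTITION BY PersonName ORDER BY [Date]) AS Island
    FROM @t
),
Starts AS (
    -- One row per island, carrying its first date
    SELECT PersonName, MIN([Date]) AS StartDate
    FROM Islands
    GROUP BY PersonName, Island
)
SELECT PersonName,
       StartDate,
       -- The next island's start becomes this island's end; NULL for the last one
       LEAD(StartDate) OVER (ORDER BY StartDate) AS EndDate
FROM Starts
ORDER BY StartDate;
```

With the sample rows this yields three rows (A, B, A), where each EndDate equals the following row's StartDate and the final A row has a NULL EndDate. Using `LEAD` avoids the self-join on `rn + 1` used in the first answer.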